Tight Global Linear Convergence Rate Bounds
for Douglas-Rachford Splitting

Pontus Giselsson 


Abstract. Recently, several authors have shown local and global convergence
rate results for Douglas-Rachford splitting under strong monotonicity,
Lipschitz continuity, and cocoercivity assumptions. Most of these focus on
the convex optimization setting. In the more general monotone inclusion
setting, Lions and Mercier showed a linear convergence rate bound under the
assumption that one of the two operators is strongly monotone and Lipschitz
continuous. We show that this bound is not tight, meaning that no problem
from the considered class converges exactly with that rate. In this paper,
we present tight global linear convergence rate bounds for that class of
problems. We also provide tight linear convergence rate bounds under the
assumptions that one of the operators is strongly monotone and cocoercive,
and that one of the operators is strongly monotone and the other is
cocoercive. All our linear convergence results are obtained by proving the
stronger property that the Douglas-Rachford operator is contractive.
1. Introduction

Douglas-Rachford splitting [12, 24] is an algorithm that solves monotone
inclusion problems of the form

$$0 \in Ax + Bx$$

where $A$ and $B$ are maximally monotone operators. A class of problems that
falls under this category is composite convex optimization problems of the
form

$$\text{minimize } f(x) + g(x) \tag{1.1}$$

where $f$ and $g$ are proper, closed, and convex functions. This holds since the
subdifferentials of proper, closed, and convex functions are maximally
monotone operators, and since Fermat's rule says that the optimality
condition for solving (1.1) is $0 \in \partial f(x) + \partial g(x)$, under a
suitable qualification condition. The algorithm has shown great potential in
many applications such as signal processing [?], image denoising [32], and
statistical estimation [5] (where the dual algorithm ADMM is discussed).

It has long been known that Douglas-Rachford splitting converges under
quite mild assumptions, see [?]. However, the rate of convergence in the
general case has just recently been shown to be $O(1/\sqrt{k})$ for the
fixed-point residual [?]. For general maximal monotone operator problems,
where one of the operators is strongly monotone and Lipschitz continuous,
Lions and Mercier showed in [24] that the Douglas-Rachford algorithm enjoys
a linear convergence rate. To the author's knowledge, this was the sole
linear convergence rate result for a long period of time for these methods.
Recently, however, many works have shown linear convergence rates for
Douglas-Rachford splitting and its dual version, ADMM,
see, The works 

in m m uni a m concern local linear convergence under different assump¬ 
tions. The works in Tl [22] GJ3] consider distributed formulations, while the 
works in [81 HU H5l [281 El EH Ezl HZl HSi HI E] show global convergence rate 
bounds under various assumptions. Of these, the works in [HI IH Hi show tight 
linear convergence rate bounds. The works in mm show tight convergence 
rate results for the problem of finding a point in the intersection of two subspaces.
In [ITS] it is shown that the linear convergence rate bounds in m (which are 
generalizations of the bounds in m) are tight for composite convex opti¬ 
mization problems where one function is strongly convex and smooth. All 
these results, except the one by Lions and Mercier, are stated in the convex 
optimization setting. In this paper, we will provide tight linear convergence 
rate bounds for monotone inclusion problems. 

We consider three different sets of assumptions under which we provide
linear convergence rate bounds. In all cases, the properties of Lipschitz
continuity or cocoercivity, and strong monotonicity, are attributed to the
operators. In the first case, we assume that one operator is strongly
monotone and the other is cocoercive. In the second case, we assume that one
operator is both strongly monotone and Lipschitz continuous. This is the
setting considered by Lions and Mercier in [24], where a non-tight linear
convergence rate bound is presented. In the third case, we assume that one
operator is both strongly monotone and cocoercive. We show in all these
settings that our bounds are tight, meaning that there exist problems from
the respective classes that converge exactly with the provided rate bound.
In the second and third cases, the rates are tight for all feasible
algorithm parameters, while in the first case, the rate is tight for many
algorithm parameters.
2. Background

In this section, we introduce some notation and define some operator and 
function properties. 

2.1. Notation 

We denote by $\mathbb{R}$ the set of real numbers and by
$\overline{\mathbb{R}} := \mathbb{R} \cup \{\infty\}$ the extended real line.
Throughout this paper, $\mathcal{H}$ denotes a separable real Hilbert space.
Its inner product is denoted by $\langle\cdot,\cdot\rangle$ and its induced
norm by $\|\cdot\|$. We denote by $\{\phi_i\}_{i=1}^{K}$ any orthonormal basis
in $\mathcal{H}$, where $K$ is the dimension of $\mathcal{H}$ (possibly
$\infty$). The gradient of $f : \mathcal{X} \to \mathbb{R}$ is denoted by
$\nabla f$, and the subdifferential operator of
$f : \mathcal{X} \to \overline{\mathbb{R}}$ is denoted by $\partial f$ and is
defined as
$\partial f(x_1) := \{u \mid f(x_2) \ge f(x_1) + \langle u, x_2 - x_1\rangle \text{ for all } x_2 \in \mathcal{X}\}$.
The conjugate function of $f$ is denoted and defined by
$f^*(y) = \sup_x \{\langle y, x\rangle - f(x)\}$. The power set of a set
$\mathcal{X}$, i.e., the set of all subsets of $\mathcal{X}$, is denoted by
$2^{\mathcal{X}}$. The graph of a (set-valued) operator
$A : \mathcal{X} \to 2^{\mathcal{Y}}$ is defined and denoted by
$\operatorname{gph} A = \{(x,y) \in \mathcal{X} \times \mathcal{Y} \mid y \in Ax\}$.
The inverse operator $A^{-1}$ is defined through its graph by
$\operatorname{gph} A^{-1} = \{(y,x) \in \mathcal{Y} \times \mathcal{X} \mid y \in Ax\}$.
The identity operator is denoted by $\mathrm{Id}$ and the resolvent of a
monotone operator $A$ is defined and denoted by $J_A = (\mathrm{Id} + A)^{-1}$.
Finally, the class of closed, proper, and convex functions
$f : \mathcal{H} \to \overline{\mathbb{R}}$ is denoted by $\Gamma_0(\mathcal{H})$.

2.2. Operator properties 

Definition 2.1 (Strong monotonicity). Let $\sigma > 0$. An operator
$A : \mathcal{H} \to 2^{\mathcal{H}}$ is $\sigma$-strongly monotone if

$$\langle u - v, x - y\rangle \ge \sigma\|x - y\|^2$$

holds for all $(x,u) \in \operatorname{gph}(A)$ and $(y,v) \in \operatorname{gph}(A)$.

The operator is merely monotone if $\sigma = 0$ in the above definition.
In the following three definitions, we state some properties for
single-valued operators $T : \mathcal{H} \to \mathcal{H}$. We state the
properties for operators with full domain, but they can also be stated for
operators with any nonempty domain $\mathcal{D} \subseteq \mathcal{H}$.

Definition 2.2 (Lipschitz continuity). Let $\beta \ge 0$. A mapping
$T : \mathcal{H} \to \mathcal{H}$ is $\beta$-Lipschitz continuous if

$$\|Tx - Ty\| \le \beta\|x - y\|$$

holds for all $x, y \in \mathcal{H}$.

Definition 2.3 (Nonexpansiveness). A mapping $T : \mathcal{H} \to \mathcal{H}$
is nonexpansive if it is 1-Lipschitz continuous.

Definition 2.4 (Contractiveness). A mapping $T : \mathcal{H} \to \mathcal{H}$
is $\delta$-contractive if it is $\delta$-Lipschitz continuous with
$\delta \in [0,1)$.

Definition 2.5 (Averaged mappings). A mapping $T : \mathcal{H} \to \mathcal{H}$
is $\alpha$-averaged if there exists a nonexpansive mapping
$R : \mathcal{H} \to \mathcal{H}$ and $\alpha \in (0,1)$ such that
$T = (1 - \alpha)\mathrm{Id} + \alpha R$.
From [3, Proposition 4.25], we know that an operator
$T : \mathcal{H} \to \mathcal{H}$ is $\alpha$-averaged if and only if it
satisfies

$$\frac{1-\alpha}{\alpha}\|(\mathrm{Id} - T)x - (\mathrm{Id} - T)y\|^2 + \|Tx - Ty\|^2 \le \|x - y\|^2 \tag{2.1}$$

for all $x, y \in \mathcal{H}$.

Definition 2.6 (Cocoercivity). Let $\beta > 0$. A mapping
$T : \mathcal{H} \to \mathcal{H}$ is $\frac{1}{\beta}$-cocoercive if

$$\langle Tx - Ty, x - y\rangle \ge \frac{1}{\beta}\|Tx - Ty\|^2$$

holds for all $x, y \in \mathcal{H}$.

3. Preliminaries 

In this section, we state and show preliminary results that are needed to
prove the linear convergence rate bounds. We state some lemmas that describe
how cocoercivity, Lipschitz continuity, and averagedness relate to each
other. We also introduce negatively averaged operators $T$, which are
defined by the property that $-T$ is averaged. We show different properties
of such operators, including that averaged maps of negatively averaged
operators are contractive. This result will be used to show linear
convergence in the case where the strong monotonicity and Lipschitz
continuity properties are split between the operators.

3.1. Useful lemmas 

Proofs of the following three lemmas are found in Appendix A.

Lemma 3.1. Assume that $\beta > 0$ and let $T : \mathcal{H} \to \mathcal{H}$.
Then $\frac{1}{2\beta}$-cocoercivity of $\beta\mathrm{Id} + T$ is equivalent to
$\beta$-Lipschitz continuity of $T$.

Lemma 3.2. Assume that $\beta \in (0,1)$. Then $\frac{1}{\beta}$-cocoercivity
of $R : \mathcal{H} \to \mathcal{H}$ is equivalent to
$\frac{\beta}{2}$-averagedness of $T = R + (1 - \beta)\mathrm{Id}$.

Lemma 3.3. Let $T : \mathcal{H} \to \mathcal{H}$ be $\delta$-contractive with
$\delta \in [0,1)$. Then $R = (1 - \alpha)\mathrm{Id} + \alpha T$ is
contractive for all $\alpha \in (0, \frac{2}{1+\delta})$. The contraction
factor is $|1 - \alpha| + \alpha\delta$.

For easier reference, we also record special cases of some results in [3]
that will be used later. Specifically, we record, in order, special cases of
[3, Proposition 4.33], [3, Proposition 4.28], and [3, Proposition 23.11].

Lemma 3.4. Let $\beta \in (0,1)$ and let $T : \mathcal{H} \to \mathcal{H}$ be
$\frac{1}{\beta}$-cocoercive. Then $(\mathrm{Id} - T)$ is
$\frac{\beta}{2}$-averaged.

Lemma 3.5. Let $T : \mathcal{H} \to \mathcal{H}$ be $\alpha$-averaged with
$\alpha \in (0, \frac{1}{2})$. Then $(2T - \mathrm{Id})$ is $2\alpha$-averaged.

Lemma 3.6. Let $A : \mathcal{H} \to 2^{\mathcal{H}}$ be maximally monotone and
$\sigma$-strongly monotone with $\sigma > 0$. Then
$J_A = (\mathrm{Id} + A)^{-1}$ is $(1 + \sigma)$-cocoercive.
3.2. Negatively averaged operators

In this section we define negatively averaged operators and show various 
properties for these. 

Definition 3.7. An operator $T : \mathcal{H} \to \mathcal{H}$ is
$\theta$-negatively averaged with $\theta \in (0,1)$ if $-T$ is
$\theta$-averaged.

This definition implies that an operator $T$ is $\theta$-negatively averaged
if and only if it satisfies

$$T = -((1 - \theta)\mathrm{Id} + \theta\bar{R}) = (\theta - 1)\mathrm{Id} + \theta R$$

where $\bar{R}$ is nonexpansive and $R := -\bar{R}$ is therefore also
nonexpansive. Since $-T$ is averaged, it is also nonexpansive, and so is $T$.

Since negatively averaged operators are nonexpansive, they can be averaged.

Definition 3.8. An $\alpha$-averaged $\theta$-negatively averaged operator
$S : \mathcal{H} \to \mathcal{H}$ is defined as
$S = (1 - \alpha)\mathrm{Id} + \alpha T$ where
$T : \mathcal{H} \to \mathcal{H}$ is $\theta$-negatively averaged.

Next, we show that averaged negatively averaged operators are contractive.

Proposition 3.9. An $\alpha$-averaged $\theta$-negatively averaged operator
$S : \mathcal{H} \to \mathcal{H}$ is
$(|1 - 2\alpha + \alpha\theta| + \alpha\theta)$-contractive.

Proof. Let $T = (\theta - 1)\mathrm{Id} + \theta R$ (for some nonexpansive
$R$) be the $\theta$-negatively averaged operator, which implies that
$S = (1 - \alpha)\mathrm{Id} + \alpha T$. Then

$$\begin{aligned}
\|Sx - Sy\| &= \|((1 - \alpha)\mathrm{Id} + \alpha T)x - ((1 - \alpha)\mathrm{Id} + \alpha T)y\| \\
&= \|(1 - 2\alpha + \alpha\theta)(x - y) + \alpha\theta(Rx - Ry)\| \\
&\le |1 - 2\alpha + \alpha\theta|\,\|x - y\| + \alpha\theta\|x - y\| \\
&= (|1 - 2\alpha + \alpha\theta| + \alpha\theta)\|x - y\|
\end{aligned}$$

since $R$ is nonexpansive. It is straightforward to verify that
$|1 - 2\alpha + \alpha\theta| + \alpha\theta < 1$ for all combinations of
$\alpha \in (0,1)$ and $\theta \in (0,1)$. Hence, $S$ is contractive and the
proof is complete. □

Next, we optimize the contraction factor w.r.t. a. 

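Proposition 3.10. Assume that $T : \mathcal{H} \to \mathcal{H}$ is
$\theta$-negatively averaged. Then the $\alpha$ that optimizes the
contraction factor for the $\alpha$-averaged $\theta$-negatively averaged
operator $S = (1 - \alpha)\mathrm{Id} + \alpha T$ is
$\alpha = \frac{1}{2-\theta}$. The corresponding optimal contraction constant
is $\frac{\theta}{2-\theta}$.

Proof. Due to the absolute value, Proposition 3.9 states that the contraction
factor $\delta$ of $S$ can be written as

$$\delta = \begin{cases} 1 - 2\alpha + \alpha\theta + \alpha\theta & \text{if } \alpha \le \frac{1}{2-\theta} \\ -(1 - 2\alpha + \alpha\theta) + \alpha\theta & \text{if } \alpha \ge \frac{1}{2-\theta} \end{cases}
= \begin{cases} 1 - 2(1 - \theta)\alpha & \text{if } \alpha \le \frac{1}{2-\theta} \\ 2\alpha - 1 & \text{if } \alpha \ge \frac{1}{2-\theta} \end{cases}$$

where the kink in the absolute value term is at
$\alpha = \frac{1}{2-\theta}$. Since $\theta \in (0,1)$, we get negative slope
for $\alpha \le \frac{1}{2-\theta}$ and positive slope for
$\alpha \ge \frac{1}{2-\theta}$. Therefore the optimal $\alpha$ is in the kink
at $\alpha = \frac{1}{2-\theta}$, which satisfies $\alpha \in (\frac{1}{2}, 1)$
since $\theta \in (0,1)$. Inserting this into the contraction factor
expression gives $\frac{\theta}{2-\theta}$. This concludes the proof. □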
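The statement of Proposition 3.10 can be checked numerically. The following is a minimal sketch, assuming an illustrative value of $\theta$ and a uniform grid over $\alpha$ (both are choices made here, not taken from the paper); the function names are hypothetical.

```python
def contraction_factor(alpha: float, theta: float) -> float:
    """Bound |1 - 2*alpha + alpha*theta| + alpha*theta from Proposition 3.9."""
    return abs(1.0 - 2.0 * alpha + alpha * theta) + alpha * theta


def optimal_alpha(theta: float) -> float:
    """Kink of the bound, alpha = 1 / (2 - theta), as in Proposition 3.10."""
    return 1.0 / (2.0 - theta)


theta = 0.6  # illustrative value, not from the paper
grid = [k / 1000.0 for k in range(1, 1000)]  # alpha sampled in (0, 1)
best_on_grid = min(grid, key=lambda a: contraction_factor(a, theta))

print("grid minimizer:", best_on_grid, "closed form:", optimal_alpha(theta))
print("bound at optimum:", contraction_factor(optimal_alpha(theta), theta),
      "closed form:", theta / (2.0 - theta))
```

The grid minimizer should agree with $\frac{1}{2-\theta}$ up to the grid spacing, and the bound at that point should match $\frac{\theta}{2-\theta}$.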

Remark 3.11. The optimal contraction factor $\frac{\theta}{2-\theta}$ is
strictly increasing in $\theta$ on the interval $\theta \in (0,1)$. Therefore
the contraction factor becomes smaller the smaller $\theta$ is.


We conclude this section by showing that the composition of an averaged
and a negatively averaged operator is negatively averaged. Before we state
the result, we need a characterization of $\theta$-negatively averaged
operators $T$. This follows directly from the characterization of averaged
operators in (2.1) since $-T$ is $\theta$-averaged:

$$\frac{1-\theta}{\theta}\|(\mathrm{Id} + T)x - (\mathrm{Id} + T)y\|^2 + \|Tx - Ty\|^2 \le \|x - y\|^2. \tag{3.1}$$


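Proposition 3.12. Assume that $T_\theta : \mathcal{H} \to \mathcal{H}$ is
$\theta$-negatively averaged and $T_\alpha : \mathcal{H} \to \mathcal{H}$ is
$\alpha$-averaged. Then $T_\theta T_\alpha$ is
$\frac{\kappa}{\kappa+1}$-negatively averaged where

$$\kappa = \frac{\theta}{1-\theta} + \frac{\alpha}{1-\alpha}.$$

Proof. Let $\kappa_\theta = \frac{\theta}{1-\theta}$ and
$\kappa_\alpha = \frac{\alpha}{1-\alpha}$, then
$\kappa = \kappa_\theta + \kappa_\alpha$. We have

$$\begin{aligned}
&\|(\mathrm{Id} + T_\theta T_\alpha)x - (\mathrm{Id} + T_\theta T_\alpha)y\|^2 \\
&\quad= \|(x - y) - (T_\alpha x - T_\alpha y) + (T_\alpha x - T_\alpha y) + (T_\theta T_\alpha x - T_\theta T_\alpha y)\|^2 \\
&\quad= \|(\mathrm{Id} - T_\alpha)x - (\mathrm{Id} - T_\alpha)y + (\mathrm{Id} + T_\theta)T_\alpha x - (\mathrm{Id} + T_\theta)T_\alpha y\|^2 \\
&\quad\le \tfrac{\kappa_\theta + \kappa_\alpha}{\kappa_\alpha}\|(\mathrm{Id} - T_\alpha)x - (\mathrm{Id} - T_\alpha)y\|^2
 + \tfrac{\kappa_\theta + \kappa_\alpha}{\kappa_\theta}\|(\mathrm{Id} + T_\theta)T_\alpha x - (\mathrm{Id} + T_\theta)T_\alpha y\|^2 \\
&\quad\le (\kappa_\theta + \kappa_\alpha)(\|x - y\|^2 - \|T_\alpha x - T_\alpha y\|^2)
 + (\kappa_\theta + \kappa_\alpha)(\|T_\alpha x - T_\alpha y\|^2 - \|T_\theta T_\alpha x - T_\theta T_\alpha y\|^2) \\
&\quad= (\kappa_\theta + \kappa_\alpha)(\|x - y\|^2 - \|T_\theta T_\alpha x - T_\theta T_\alpha y\|^2) \\
&\quad= \kappa(\|x - y\|^2 - \|T_\theta T_\alpha x - T_\theta T_\alpha y\|^2)
\end{aligned} \tag{3.2}$$

where the first inequality follows from convexity of $\|\cdot\|^2$. More
precisely, let $t \in (0,1)$, then, by convexity of $\|\cdot\|^2$, we conclude
that

$$\|x + y\|^2 = \left\|t\tfrac{1}{t}x + (1 - t)\tfrac{1}{1-t}y\right\|^2 \le t\left\|\tfrac{1}{t}x\right\|^2 + (1 - t)\left\|\tfrac{1}{1-t}y\right\|^2 = \tfrac{1}{t}\|x\|^2 + \tfrac{1}{1-t}\|y\|^2.$$

Letting $t = \frac{\kappa_\alpha}{\kappa_\theta + \kappa_\alpha}$, which implies
that $1 - t = \frac{\kappa_\theta}{\kappa_\theta + \kappa_\alpha}$, gives the
first inequality in (3.2). The second inequality in (3.2) follows from (2.1)
and (3.1). The relation in (3.2) coincides with the definition of negative
averagedness in (3.1). Thus $T_\theta T_\alpha$ is $\phi$-negatively averaged
with $\phi$ satisfying $\frac{1-\phi}{\phi} = \frac{1}{\kappa}$. This gives
$\phi = \frac{\kappa}{\kappa+1}$ and the proof is complete. □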
Remark 3.13. This result can readily be extended to show averagedness of
$T = T_1 T_2 \cdots T_N$ where $T_i$ are $\alpha_i$-(negatively) averaged for
$i = 1, \ldots, N$. We get that $T$ is $\frac{\kappa}{1+\kappa}$-negatively
averaged with $\kappa = \sum_{i=1}^{N} \frac{\alpha_i}{1-\alpha_i}$ if an odd
number of the $T_i$ are negatively averaged, and that $T$ is
$\frac{\kappa}{1+\kappa}$-averaged if an even number of the $T_i$ are
negatively averaged. Similar results have been presented, e.g., in
[3, Proposition 4.32], which is improved in [7]. Our result extends these
results in that it allows also for negatively averaged operators, and it
reduces to the result in [7] for averaged operators.
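The bookkeeping in Remark 3.13 is easy to mechanize. The sketch below assumes each factor is described by a hypothetical pair `(alpha_i, is_negative)`; the function name and data layout are illustrative choices, not notation from the paper.

```python
def composed_averagedness(factors):
    """Combine (alpha_i, is_negative) pairs for T = T_1 T_2 ... T_N.

    Follows Remark 3.13: kappa = sum of alpha_i / (1 - alpha_i), the
    composition has parameter kappa / (1 + kappa), and it is negatively
    averaged exactly when an odd number of factors are negatively averaged.
    """
    kappa = sum(a / (1.0 - a) for a, _ in factors)
    is_negative = sum(1 for _, neg in factors if neg) % 2 == 1
    return kappa / (1.0 + kappa), is_negative


# One negatively averaged factor composed with one averaged factor,
# which is the two-operator case of Proposition 3.12.
print(composed_averagedness([(1.0 / 3.0, True), (0.75, False)]))
```

With $\theta = \frac{1}{3}$ and $\alpha = \frac{3}{4}$ this gives $\kappa = \frac{1}{2} + 3$, so the composition is $\frac{3.5}{4.5}$-negatively averaged.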

4. Douglas-Rachford splitting 

Douglas-Rachford splitting can be applied to solve monotone inclusion
problems of the form

$$0 \in Ax + Bx \tag{4.1}$$

where $A, B : \mathcal{H} \to 2^{\mathcal{H}}$ are maximally monotone
operators. The algorithm separates $A$ and $B$ by only touching the
corresponding resolvents, where the resolvent
$J_A : \mathcal{H} \to \mathcal{H}$ is defined as

$$J_A := (A + \mathrm{Id})^{-1}.$$

The resolvent has full domain since $A$ is assumed maximally monotone, see
[?] and [3, Proposition 23.7]. If $A = \partial f$ where $f$ is a proper,
closed, and convex function, then $J_A = \operatorname{prox}_f$ where the prox
operator $\operatorname{prox}_f$ is defined as

$$\operatorname{prox}_f(z) = \operatorname*{argmin}_x \left\{f(x) + \tfrac{1}{2}\|x - z\|^2\right\}. \tag{4.2}$$

That this holds follows directly from Fermat's rule [3, Theorem 16.2] applied
to the proximal operator definition.

The Douglas-Rachford algorithm is defined by the iteration

$$z^{k+1} = ((1 - \alpha)\mathrm{Id} + \alpha R_A R_B)z^k \tag{4.3}$$

where $\alpha \in (0,1)$ (we will see that also $\alpha > 1$ can sometimes be
used) and $R_A : \mathcal{H} \to \mathcal{H}$ is the reflected resolvent,
which is defined as

$$R_A := 2J_A - \mathrm{Id}.$$

(Note that what is traditionally called Douglas-Rachford splitting is when
$\alpha = 1/2$ in (4.3). The case with $\alpha = 1$ in (4.3) is often referred
to as the Peaceman-Rachford algorithm, see [?]. We will use the term
Douglas-Rachford splitting for all feasible choices of $\alpha$.)

Since the reflected resolvent is nonexpansive in the general case
[3, Corollary 23.10], and since compositions of nonexpansive operators are
nonexpansive, the Douglas-Rachford algorithm is an averaged iteration of a
nonexpansive mapping when $\alpha \in (0,1)$. Therefore, Douglas-Rachford
splitting is a special case of the Krasnosel'skii-Mann iteration [?], which
is known to converge to a fixed-point of the nonexpansive operator, in this
case $R_A R_B$, see [3, Theorem 5.14]. Since an $x \in \mathcal{H}$ solves
(4.1) if and only if $x = J_B z$ where $z = R_A R_B z$, see
[3, Proposition 25.1], this algorithm can be used to solve monotone inclusion
problems of the form (4.1). Note that solving (4.1) is equivalent to solving

$$0 \in \gamma Ax + \gamma Bx$$

for any $\gamma \in (0, \infty)$. Then we can define $A_\gamma = \gamma A$ and
$B_\gamma = \gamma B$, and (4.1) can also be solved by the iteration

$$z^{k+1} = ((1 - \alpha)\mathrm{Id} + \alpha R_{A_\gamma} R_{B_\gamma})z^k. \tag{4.4}$$

Therefore, $\gamma$ is an algorithm parameter that affects the progress of
the iterations.
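The iteration (4.4) can be sketched in a few lines. This is a minimal illustration, assuming the resolvents $J_{\gamma A}$ and $J_{\gamma B}$ are supplied as callables; the scalar test problem ($Ax = \sigma x$, $Bx = \beta x - b$) and all numerical values are hypothetical and chosen only so that the solution is known in closed form. For this composition order, the solution estimate is recovered by applying $J_{\gamma B}$ to the final iterate.

```python
def douglas_rachford(resolvent_a, resolvent_b, z0, alpha=0.5, iterations=100):
    """Run z <- (1 - alpha) z + alpha R_A R_B z, with R = 2 J - Id.

    resolvent_a and resolvent_b are callables for J_{gamma A} and J_{gamma B}.
    Returns the final z; a solution estimate is resolvent_b(z).
    """
    z = z0
    for _ in range(iterations):
        rb = 2.0 * resolvent_b(z) - z        # R_{gamma B} z
        ra_rb = 2.0 * resolvent_a(rb) - rb   # R_{gamma A} R_{gamma B} z
        z = (1.0 - alpha) * z + alpha * ra_rb
    return z


# Hypothetical scalar problem: A x = sigma * x and B x = beta * x - b,
# so that 0 = A x + B x has the solution x = b / (sigma + beta).
sigma, beta, b, gamma = 1.0, 2.0, 3.0, 0.7
j_a = lambda z: z / (1.0 + gamma * sigma)
j_b = lambda z: (z + gamma * b) / (1.0 + gamma * beta)

z_final = douglas_rachford(j_a, j_b, z0=10.0)
x_final = j_b(z_final)
print(x_final, b / (sigma + beta))
```

Passing resolvents rather than operators mirrors how the algorithm only ever touches $A$ and $B$ through $J_{\gamma A}$ and $J_{\gamma B}$.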

The objective of this paper is to provide tight linear convergence rate
bounds for the Douglas-Rachford algorithm under various assumptions. Using
these bounds, we will show how to select the algorithm parameters $\gamma$
and $\alpha$ that optimize these bounds. The first setting we consider is
when $A$ is strongly monotone and $B$ is cocoercive.


5. A strongly monotone and B cocoercive 

In this section, we show linear convergence for Douglas-Rachford splitting in
the case where $A$ and $B$ are maximally monotone, $A$ is strongly monotone,
and $B$ is cocoercive. That is, we make the following assumptions.

Assumption 5.1. Suppose that:

(i) $A : \mathcal{H} \to 2^{\mathcal{H}}$ is maximally monotone and $\sigma$-strongly monotone.

(ii) $B : \mathcal{H} \to \mathcal{H}$ is maximally monotone and $\frac{1}{\beta}$-cocoercive.

Before we can state the main linear convergence result, we need to
characterize the properties of the resolvent, the reflected resolvent, and the
composition between reflected resolvents. This is done in the following series
of propositions, the first of which is proven in Appendix B.

Proposition 5.2. The resolvent $J_B$ of a $\frac{1}{\beta}$-cocoercive
operator $B : \mathcal{H} \to \mathcal{H}$ is
$\frac{\beta}{2(1+\beta)}$-averaged.

This implies that also the reflected resolvent is averaged.

Proposition 5.3. The reflected resolvent of a $\frac{1}{\beta}$-cocoercive
operator $B : \mathcal{H} \to \mathcal{H}$ is $\frac{\beta}{1+\beta}$-averaged.

Proof. This follows directly from Proposition 5.2 and Lemma 3.5. □

If the operator instead is strongly monotone, the reflected resolvent is
negatively averaged.

Proposition 5.4. The reflected resolvent of a $\sigma$-strongly monotone and
maximal monotone operator $A : \mathcal{H} \to 2^{\mathcal{H}}$ is
$\frac{1}{1+\sigma}$-negatively averaged.
Proof. From Lemma 3.6, we have that the resolvent $J_A$ is
$(1 + \sigma)$-cocoercive. Using Lemma 3.4, this implies that
$\mathrm{Id} - J_A$ is $\frac{1}{2(1+\sigma)}$-averaged. Then using Lemma 3.5,
this implies that
$2(\mathrm{Id} - J_A) - \mathrm{Id} = \mathrm{Id} - 2J_A = -R_A$ is
$\frac{1}{1+\sigma}$-averaged, hence $R_A$ is
$\frac{1}{1+\sigma}$-negatively averaged. This completes the proof. □

The composition of the reflected resolvents of a strongly monotone operator
and a cocoercive operator is negatively averaged.

Proposition 5.5. Suppose that Assumption 5.1 holds. Then, the composition
$R_A R_B$ is
$\frac{\frac{1}{\sigma}+\beta}{1+\frac{1}{\sigma}+\beta}$-negatively averaged.

Proof. Since $R_A$ is $\frac{1}{1+\sigma}$-negatively averaged and $R_B$ is
$\frac{\beta}{1+\beta}$-averaged, see Propositions 5.3 and 5.4, we can apply
Proposition 3.12. We get that

$$\kappa = \frac{\frac{1}{1+\sigma}}{1 - \frac{1}{1+\sigma}} + \frac{\frac{\beta}{1+\beta}}{1 - \frac{\beta}{1+\beta}} = \frac{1}{\sigma} + \beta$$

and that the averagedness parameter of the negatively averaged operator
$R_A R_B$ is given by
$\frac{\kappa}{\kappa+1} = \frac{\frac{1}{\sigma}+\beta}{1+\frac{1}{\sigma}+\beta}$.
This concludes the proof. □


With these results, we can now show the following linear convergence
rate bounds for Douglas-Rachford splitting under Assumption 5.1. The
theorem is proven in Appendix B.


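Theorem 5.6. Suppose that Assumption 5.1 holds, that $\alpha \in (0,1)$, that
$\gamma \in (0, \infty)$, and that the Douglas-Rachford algorithm (4.3) is
applied to solve $0 \in \gamma Ax + \gamma Bx$. Then the algorithm converges
at least with rate factor

$$\left|1 - 2\alpha + \alpha\frac{\frac{1}{\gamma\sigma} + \gamma\beta}{1 + \frac{1}{\gamma\sigma} + \gamma\beta}\right| + \alpha\frac{\frac{1}{\gamma\sigma} + \gamma\beta}{1 + \frac{1}{\gamma\sigma} + \gamma\beta}. \tag{5.1}$$

Optimizing this rate bound w.r.t. $\alpha$ and $\gamma$ gives
$\gamma = \frac{1}{\sqrt{\beta\sigma}}$ and
$\alpha = \frac{1 + 2\sqrt{\beta/\sigma}}{2(1 + \sqrt{\beta/\sigma})}$. The
corresponding optimal rate bound is
$\frac{\sqrt{\beta/\sigma}}{\sqrt{\beta/\sigma}+1}$.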
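To make the bound in Theorem 5.6 concrete, the following sketch evaluates the rate factor (5.1) and the stated optimal parameters. The function names and the sample moduli are illustrative assumptions, not part of the paper.

```python
import math


def dr_rate_split(alpha, gamma, sigma, beta):
    """Rate factor (5.1): A sigma-strongly monotone, B 1/beta-cocoercive."""
    kappa = 1.0 / (gamma * sigma) + gamma * beta
    theta = kappa / (1.0 + kappa)  # negative averagedness of R_A R_B
    return abs(1.0 - 2.0 * alpha + alpha * theta) + alpha * theta


def optimal_parameters_split(sigma, beta):
    """Optimal (alpha, gamma, rate) as stated in Theorem 5.6."""
    r = math.sqrt(beta / sigma)
    gamma = 1.0 / math.sqrt(beta * sigma)
    alpha = (1.0 + 2.0 * r) / (2.0 * (1.0 + r))
    return alpha, gamma, r / (r + 1.0)


sigma, beta = 0.5, 8.0  # illustrative moduli
alpha_opt, gamma_opt, rate_opt = optimal_parameters_split(sigma, beta)
print(alpha_opt, gamma_opt, rate_opt)
print(dr_rate_split(alpha_opt, gamma_opt, sigma, beta))
```

Evaluating (5.1) at the optimal parameters should reproduce the closed-form optimal rate, which gives a quick consistency check between the two formulas.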


5.1. Tightness 

In this section, we present an example that shows tightness of the linear
convergence rate bounds in Theorem 5.6 for many algorithm parameters.
We consider a two-dimensional Euclidean example, which is given by the
following convex optimization problem:

$$\text{minimize } f(x) + f^*(x) \tag{5.2}$$

where

$$f(x) = \tfrac{\beta}{2}x_1^2, \tag{5.3}$$

and $x = (x_1, x_2)$, and $\beta > 0$. The gradient is
$\nabla f(x) = (\beta x_1, 0)$, so it is cocoercive with factor
$\frac{1}{\beta}$. According to [3, Theorem 18.15], this is equivalent to
$f^*$ being $\frac{1}{\beta}$-strongly convex, and therefore $\partial f^*$
is $\sigma := \frac{1}{\beta}$-strongly monotone.
The following proposition shows that when solving (5.2) with $f$ defined
in (5.3) using Douglas-Rachford splitting, the upper linear convergence rate
bound is exactly attained. The result is proven in Appendix B.

Proposition 5.7. Suppose that the Douglas-Rachford algorithm is applied
to solve (5.2) with $f$ in (5.3). Further suppose that the parameters
$\gamma$ and $\alpha$ satisfy $\gamma \in (0, \infty)$ and
$\alpha \in [c, 1)$ where
$c = \frac{1 + \frac{1}{\gamma\sigma} + \gamma\beta}{2 + \frac{1}{\gamma\sigma} + \gamma\beta}$,
and $z^0 = (0, z_2^0)$ with $z_2^0 \ne 0$. Then the $z^k$ sequence in (4.3)
converges exactly with rate (5.1) in Theorem 5.6.

So, for all $\gamma$ parameters and some $\alpha$ parameters, the provided
bound is tight. Especially, the optimal parameter choices
$\gamma = \frac{1}{\sqrt{\beta\sigma}}$ and
$\alpha = \frac{1 + 2\sqrt{\beta/\sigma}}{2(1 + \sqrt{\beta/\sigma})}$ give a
tight bound.

It is interesting to note that although we have considered a more general
class of problems than convex optimization problems, a convex optimization
problem is used to attain the worst case rate.
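The tightness example can be simulated directly. The sketch below assumes the splitting $A = \partial f^*$, $B = \nabla f$ for $f(x) = \frac{\beta}{2}x_1^2$, uses the closed-form proximal maps of $\gamma f$ and $\gamma f^*$ (where $f^*(y) = \frac{1}{2\beta}y_1^2$ plus the indicator of $y_2 = 0$), and picks illustrative parameter values with $\alpha$ above the kink of the bound.

```python
def rate_bound(alpha, gamma, sigma, beta):
    """Rate factor (5.1) from Theorem 5.6."""
    kappa = 1.0 / (gamma * sigma) + gamma * beta
    theta = kappa / (1.0 + kappa)
    return abs(1.0 - 2.0 * alpha + alpha * theta) + alpha * theta


def dr_step(z, alpha, gamma, beta):
    """One Douglas-Rachford step for minimize f + f*, f(x) = (beta/2) x_1^2."""
    z1, z2 = z
    # R_{gamma B} with B = grad f: the prox of gamma*f only scales z1.
    rb1 = 2.0 * z1 / (1.0 + gamma * beta) - z1
    rb2 = z2
    # R_{gamma A} with A = df*: the prox of gamma*f* scales the first
    # coordinate and projects the second coordinate onto 0.
    ra1 = 2.0 * rb1 / (1.0 + gamma / beta) - rb1
    ra2 = -rb2
    return ((1.0 - alpha) * z1 + alpha * ra1,
            (1.0 - alpha) * z2 + alpha * ra2)


beta, gamma, alpha = 2.0, 1.0, 0.9  # illustrative values; sigma = 1 / beta
z_start = (0.0, 1.0)
z_next = dr_step(z_start, alpha, gamma, beta)
observed = abs(z_next[1]) / abs(z_start[1])
predicted = rate_bound(alpha, gamma, 1.0 / beta, beta)
print(observed, predicted)
```

Starting from a point with zero first coordinate, each step multiplies the second coordinate by $1 - 2\alpha$, so the observed per-step ratio can be compared with the bound (5.1).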


5.2. Comparison to other bounds 

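In [?] it was shown that Douglas-Rachford splitting converges as

$$\frac{\sqrt{\beta/\sigma}-1}{\sqrt{\beta/\sigma}+1}$$

when solving composite optimization problems of the form
$0 \in \gamma\nabla f + \gamma\partial g$, where $\nabla f$ is
$\sigma$-strongly monotone and $\frac{1}{\beta}$-cocoercive and the algorithm
parameters are chosen as $\alpha = 1$ and
$\gamma = \frac{1}{\sqrt{\beta\sigma}}$. In our setting, with $\partial f$
being $\sigma$-strongly monotone and $\partial g$ being
$\frac{1}{\beta}$-cocoercive, we can instead pose the equivalent problem
$0 \in \gamma\partial\hat{f}(x) + \gamma\partial\hat{g}(x)$ where
$\hat{f} = f - \frac{\sigma}{2}\|\cdot\|^2$ and
$\hat{g} = g + \frac{\sigma}{2}\|\cdot\|^2$. Then $\partial\hat{f}$ is merely
monotone and $\partial\hat{g}$ is $\sigma$-strongly monotone and
$\frac{1}{\beta+\sigma}$-cocoercive. For that problem, [?] shows a linear
convergence rate of at least
$\frac{\sqrt{(\beta+\sigma)/\sigma}-1}{\sqrt{(\beta+\sigma)/\sigma}+1}$ (when
optimal parameters are used). This rate turns out to be better than the rate
provided in Theorem 5.6, i.e.,
$\frac{\sqrt{\beta/\sigma}}{\sqrt{\beta/\sigma}+1}$, which assumes that the
strong convexity and smoothness properties are split between the two
operators. This is shown by the following chain of equivalences, which
departs from the fact that the square root is sub-additive, i.e., that
$\sqrt{\beta+\sigma} \le \sqrt{\beta} + \sqrt{\sigma}$ for
$\beta, \sigma \ge 0$:

$$\begin{aligned}
&\sqrt{\beta+\sigma} - \sqrt{\sigma} \le \sqrt{\beta} \\
\Leftrightarrow\ &\sqrt{(\beta+\sigma)/\sigma} - 1 \le \sqrt{\beta/\sigma} \\
\Leftrightarrow\ &\frac{\sqrt{(\beta+\sigma)/\sigma}-1}{\sqrt{(\beta+\sigma)/\sigma}+1} \le \frac{\sqrt{\beta/\sigma}}{\sqrt{(\beta+\sigma)/\sigma}+1} \left(\le \frac{\sqrt{\beta/\sigma}}{\sqrt{\beta/\sigma}+1}\right).
\end{aligned}$$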
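The comparison above can be spot-checked numerically. This is a small sketch with hypothetical function names and a handful of arbitrarily chosen $(\sigma, \beta)$ pairs; it only evaluates the two closed-form rates discussed in the text.

```python
import math


def rate_split(sigma, beta):
    """Optimal rate of Theorem 5.6 (properties split between the operators)."""
    r = math.sqrt(beta / sigma)
    return r / (r + 1.0)


def rate_shifted(sigma, beta):
    """Rate after shifting both properties into one operator."""
    s = math.sqrt((beta + sigma) / sigma)
    return (s - 1.0) / (s + 1.0)


for sigma, beta in [(1.0, 1.0), (0.5, 8.0), (2.0, 3.0), (0.01, 100.0)]:
    print(sigma, beta, rate_shifted(sigma, beta), rate_split(sigma, beta))
```

In every sampled case the shifted rate should be no larger than the split rate, in line with the chain of inequalities.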


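This implies that, from a worst case perspective, it is better to shift both
properties into one operator. This is also always possible, without increasing
the computational cost in the algorithm, since the prox-operator is just
shifted slightly:

$$\begin{aligned}
\operatorname{prox}_{\gamma\hat{f}}(z) &= \operatorname*{argmin}_x \left\{\hat{f}(x) + \tfrac{1}{2\gamma}\|x - z\|^2\right\} \\
&= \operatorname*{argmin}_x \left\{f(x) - \tfrac{\sigma}{2}\|x\|^2 + \tfrac{1}{2\gamma}\|x - z\|^2\right\} \\
&= \operatorname*{argmin}_x \left\{f(x) + \tfrac{1-\gamma\sigma}{2\gamma}\left\|x - \tfrac{1}{1-\gamma\sigma}z\right\|^2\right\} \\
&= \operatorname{prox}_{\frac{\gamma}{1-\gamma\sigma}f}\left(\tfrac{1}{1-\gamma\sigma}z\right).
\end{aligned}$$

A similar relation holds for $\operatorname{prox}_{\gamma\hat{g}}$ with the
sign in front of $\gamma\sigma$ flipped.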
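The shifted prox identity can be verified on a scalar quadratic, where every proximal map has a closed form. The sketch assumes the hypothetical test function $f(x) = \frac{a}{2}x^2$ with $a \ge \sigma$ and requires $\gamma\sigma < 1$ so that the rescaled step size is positive.

```python
def prox_quadratic(w, tau, a):
    """prox of tau * f at w, for the scalar function f(x) = (a/2) x^2."""
    return w / (1.0 + tau * a)


def prox_shifted(z, gamma, a, sigma):
    """prox of gamma * fhat, fhat = f - (sigma/2) x^2, via the prox of f."""
    scale = 1.0 - gamma * sigma  # assumed positive
    return prox_quadratic(z / scale, gamma / scale, a)


a, sigma, gamma, z = 3.0, 1.0, 0.5, 2.0  # illustrative values
direct = prox_quadratic(z, gamma, a - sigma)  # fhat(x) = ((a - sigma)/2) x^2
print(direct, prox_shifted(z, gamma, a, sigma))
```

Both evaluations should return the same point, which illustrates that the shift costs only a rescaling of the argument and the step size.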


6. A strongly monotone and Lipschitz continuous 

In this section, we consider the case where one of the operators is
$\sigma$-strongly monotone and $\beta$-Lipschitz continuous. This assumption
is stated next.

Assumption 6.1. Suppose that:

(i) The operators $A : \mathcal{H} \to \mathcal{H}$ and $B : \mathcal{H} \to 2^{\mathcal{H}}$ are maximally monotone,

(ii) $A$ is $\sigma$-strongly monotone and $\beta$-Lipschitz continuous.

First, we state a result that characterizes the resolvent of $A$. It is
proven in Appendix C.

Proposition 6.2. Assume that $A : \mathcal{H} \to \mathcal{H}$ is a maximal
monotone $\beta$-Lipschitz continuous operator. Then the resolvent
$J_A = (\mathrm{Id} + A)^{-1}$ satisfies

$$2\langle J_A x - J_A y, x - y\rangle \ge \|x - y\|^2 + (1 - \beta^2)\|J_A x - J_A y\|^2. \tag{6.1}$$

This resolvent property is used when proving the following contraction
factor of the reflected resolvent. The result is proven in Appendix C.

Theorem 6.3. Suppose that $A : \mathcal{H} \to \mathcal{H}$ is a
$\sigma$-strongly monotone and $\beta$-Lipschitz continuous operator. Then
the reflected resolvent $R_A = 2J_A - \mathrm{Id}$ is
$\sqrt{1 - \frac{4\sigma}{1+2\sigma+\beta^2}}$-contractive.


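The parameter $\gamma$ that optimizes the contraction factor for
$R_{\gamma A}$ is the minimizer of
$h(\gamma) := 1 - \frac{4\gamma\sigma}{1+2\gamma\sigma+(\gamma\beta)^2}$
($\gamma A$ is $\gamma\sigma$-strongly monotone and $\gamma\beta$-Lipschitz
continuous). The gradient is
$\nabla h(\gamma) = \frac{4\sigma((\gamma\beta)^2 - 1)}{(1+2\gamma\sigma+(\gamma\beta)^2)^2}$,
which implies that the extreme points are given by
$\gamma = \pm\frac{1}{\beta}$. Since $\gamma > 0$ and the gradient is positive
for $\gamma > \frac{1}{\beta}$ and negative for
$\gamma \in (0, \frac{1}{\beta})$, $\gamma = \frac{1}{\beta}$ optimizes the
contraction factor. The corresponding rate is

$$\sqrt{1 - \frac{4\gamma\sigma}{1+2\gamma\sigma+(\gamma\beta)^2}} = \sqrt{1 - \frac{2\sigma/\beta}{1+\sigma/\beta}} = \sqrt{\frac{1-\sigma/\beta}{1+\sigma/\beta}} = \sqrt{\frac{\beta/\sigma-1}{\beta/\sigma+1}}.$$

This is summarized in the following proposition.

Proposition 6.4. The parameter $\gamma$ that optimizes the contraction factor
of $R_{\gamma A}$ is given by $\gamma = \frac{1}{\beta}$. The corresponding
contraction factor is $\sqrt{\frac{\beta/\sigma-1}{\beta/\sigma+1}}$.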
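A quick numerical check of the optimal $\gamma$ in the Lipschitz case is sketched below. The function name, the moduli, and the search grid are illustrative assumptions; the formula evaluated is the contraction factor of Theorem 6.3 applied to $\gamma A$.

```python
import math


def reflected_resolvent_factor_lipschitz(gamma, sigma, beta):
    """Contraction factor of R_{gamma A} for sigma-strongly monotone,
    beta-Lipschitz A (Theorem 6.3 applied to gamma * A)."""
    gs, gb = gamma * sigma, gamma * beta
    return math.sqrt(1.0 - 4.0 * gs / (1.0 + 2.0 * gs + gb * gb))


sigma, beta = 1.0, 2.0  # illustrative moduli with sigma <= beta
gammas = [k / 1000.0 for k in range(1, 3001)]
best_gamma = min(
    gammas, key=lambda g: reflected_resolvent_factor_lipschitz(g, sigma, beta)
)
closed_form = math.sqrt((beta / sigma - 1.0) / (beta / sigma + 1.0))
print(best_gamma, 1.0 / beta)
print(reflected_resolvent_factor_lipschitz(1.0 / beta, sigma, beta), closed_form)
```

The grid minimizer should sit at $\gamma = \frac{1}{\beta}$ up to the grid spacing, and the factor there should match the closed form in Proposition 6.4.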

Now, we are ready to state the convergence rate results for Douglas- 
Rachford splitting. 


Theorem 6.5. Suppose that Assumption 6.1 holds and that the Douglas-Rachford
algorithm (4.3) is applied to solve $0 \in \gamma Ax + \gamma Bx$. Let
$\delta = \sqrt{1 - \frac{4\gamma\sigma}{1+2\gamma\sigma+(\gamma\beta)^2}}$,
then the algorithm converges at least with rate factor

$$|1 - \alpha| + \alpha\delta \tag{6.2}$$

for all $\alpha \in (0, \frac{2}{1+\delta})$. Optimizing this bound w.r.t.
$\alpha$ and $\gamma$ gives $\alpha = 1$ and $\gamma = \frac{1}{\beta}$ and
corresponding optimal rate bound
$\sqrt{\frac{\beta/\sigma-1}{\beta/\sigma+1}}$.

Proof. It follows immediately from Theorem 6.3, Lemma 3.3, and
Proposition 6.4 by noting that $\alpha = 1$ minimizes (6.2). □


In the following section, we will see that there exists a problem from 
the considered class that converges exactly with the provided rate. 


6.1. Tightness 

We consider a problem where $A$ is a scaled rotation operator, i.e., it is
given by

$$A = d\begin{bmatrix} \cos\psi & -\sin\psi \\ \sin\psi & \cos\psi \end{bmatrix} \tag{6.3}$$

where $0 \le \psi < \frac{\pi}{2}$ and $d \in (0, \infty)$. First, we show that
$A$ is strongly monotone and Lipschitz continuous.

Proposition 6.6. The operator $A$ in (6.3) is $d\cos\psi$-strongly monotone
and $d$-Lipschitz continuous.

Proof. We first show that $A$ is $d\cos\psi$-strongly monotone. Since $A$ is
linear, we have

$$\begin{aligned}
\langle Av, v\rangle &= d\langle(\cos\psi\, v_1 - \sin\psi\, v_2,\ \sin\psi\, v_1 + \cos\psi\, v_2), (v_1, v_2)\rangle \\
&= d\cos\psi\,(v_1^2 + v_2^2) = d\cos\psi\,\|v\|^2.
\end{aligned}$$

That is, $A$ is $d\cos\psi$-strongly monotone. Since $A$ is a scaled (with
$d$) rotation operator, its largest eigenvalue is $d$, and hence $A$ is
$d$-Lipschitz. This concludes the proof. □


We need an explicit form of the reflected resolvent of $A$ to show that the
rate is tight. To state it, we define the following alternative arctan
definition that is valid when $\tan\xi = \frac{x}{y}$ and $x \ge 0$:

$$\arctan_2\left(\tfrac{x}{y}\right) = \begin{cases} \arctan(\tfrac{x}{y}) & \text{if } x \ge 0,\ y > 0 \\ \arctan(\tfrac{x}{y}) + \pi & \text{if } x \ge 0,\ y < 0 \\ \tfrac{\pi}{2} & \text{if } x \ge 0,\ y = 0 \end{cases} \tag{6.4}$$

This arctan is defined for nonnegative numerators $x$ only, and outputs an
angle in the interval $[0, \pi]$.

Next, we provide the expression for the reflected resolvent. To simplify
its notation, we let $\sigma$ denote the strong monotonicity modulus and
$\beta$ the Lipschitz constant of $A$, i.e.,

$$\sigma = d\cos\psi, \quad \beta = d. \tag{6.5}$$

The following result is proven in Appendix C.

Proposition 6.7. The reflected resolvent of $\gamma A$, with $A$ in (6.3) and
$\gamma \in (0, \infty)$, is

$$R_{\gamma A} = \sqrt{1 - \frac{4\gamma\sigma}{1+2\gamma\sigma+(\gamma\beta)^2}} \begin{bmatrix} \cos\xi & \sin\xi \\ -\sin\xi & \cos\xi \end{bmatrix}$$

where $\sigma$ and $\beta$ are defined in (6.5) and $\xi$ satisfies
$\xi = \arctan_2\left(\frac{2\gamma\sqrt{\beta^2-\sigma^2}}{1-(\gamma\beta)^2}\right)$
with $\arctan_2$ defined in (6.4).

That is, the reflected resolvent is first a rotation then a contraction.
The contraction factor is exactly the upper bound on the contraction factor
in Theorem 6.3. Therefore, the $A$ in (6.3) can be used to show tightness of
the results in Theorem 6.5. To do so, we need another operator $B$ that
cancels the rotation introduced by $A$. For $\alpha \in (0,1]$, we will need
$R_{\gamma A}R_{\gamma B} = \sqrt{1 - \frac{4\gamma\sigma}{1+2\gamma\sigma+(\gamma\beta)^2}}\, I$
and for $\alpha > 1$, we will need
$R_{\gamma A}R_{\gamma B} = -\sqrt{1 - \frac{4\gamma\sigma}{1+2\gamma\sigma+(\gamma\beta)^2}}\, I$.
This is clearly achieved if $R_{\gamma B}$ is another rotation operator. Using
the following straightforward consequence of Minty's theorem (see [?]) we
conclude that any rotation operator (since they are nonexpansive) is the
reflected resolvent of a maximally monotone operator.

Proposition 6.8. An operator $R : \mathcal{H} \to \mathcal{H}$ is nonexpansive
if and only if it is the reflected resolvent of a maximally monotone operator.

Proof. It follows immediately from [3, Corollary 23.8] and
[3, Proposition 4.2]. □
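The form of the reflected resolvent in Proposition 6.7 can be checked with complex arithmetic. The sketch below rests on an assumption made here for convenience: $\mathbb{R}^2$ is identified with the complex plane, so that the scaled rotation $A$ acts as multiplication by $d e^{i\psi}$ and a rotation matrix with entries $(\cos\xi, \sin\xi; -\sin\xi, \cos\xi)$ acts as multiplication by $e^{-i\xi}$. Python's `math.atan2` with a nonnegative first argument plays the role of $\arctan_2$; the numerical values are illustrative.

```python
import cmath
import math


def reflected_resolvent_rotation(gamma, d, psi):
    """R_{gamma A} as a complex multiplier for A = d * rotation(psi)."""
    a = gamma * d * cmath.exp(1j * psi)  # gamma * A acts as multiplication by a
    return (1 - a) / (1 + a)             # 2 (1 + a)^{-1} - 1


gamma, d, psi = 0.8, 1.5, 0.6  # illustrative values
sigma, beta = d * math.cos(psi), d
r = reflected_resolvent_rotation(gamma, d, psi)

delta = math.sqrt(
    1.0 - 4.0 * gamma * sigma / (1.0 + 2.0 * gamma * sigma + (gamma * beta) ** 2)
)
xi = math.atan2(2.0 * gamma * math.sqrt(beta**2 - sigma**2),
                1.0 - (gamma * beta) ** 2)

print(abs(r), delta)                      # modulus vs. contraction factor
print(r, delta * cmath.exp(-1j * xi))     # full multiplier vs. closed form
```

The modulus of the multiplier should equal the contraction factor of Theorem 6.3 applied to $\gamma A$, and its angle should equal $-\xi$.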


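With this in mind, we can state the tightness claim.

Proposition 6.9. Let $\gamma \in (0, \infty)$,
$\delta = \sqrt{1 - \frac{4\gamma\sigma}{1+2\gamma\sigma+(\gamma\beta)^2}}$, and
$\xi$ be defined as in Proposition 6.7. Suppose that $A$ is as in (6.3) and
$B$ is maximally monotone and satisfies either of the following:

(i) if $\alpha \in (0,1]$: $B = B_1$ with
$R_{\gamma B_1} = \begin{bmatrix} \cos\xi & -\sin\xi \\ \sin\xi & \cos\xi \end{bmatrix}$,

(ii) if $\alpha \in (1, \frac{2}{1+\delta})$: $B = B_2$ with
$R_{\gamma B_2} = \begin{bmatrix} \cos(\pi-\xi) & \sin(\pi-\xi) \\ -\sin(\pi-\xi) & \cos(\pi-\xi) \end{bmatrix}$.

Then the $z^k$ sequence for solving $0 \in \gamma Ax + \gamma Bx$ using (4.3)
converges exactly with the rate $|1 - \alpha| + \alpha\delta$.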
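Proof. Case (i): Using the reflected resolvent $R_{\gamma A}$ in
Proposition 6.7 and that $\alpha \in (0,1]$, we conclude that

$$\begin{aligned}
z^{k+1} &= (1 - \alpha)z^k + \alpha R_{\gamma A}R_{\gamma B}z^k \\
&= (1 - \alpha)z^k + \alpha\delta \begin{bmatrix} \cos\xi & \sin\xi \\ -\sin\xi & \cos\xi \end{bmatrix} \begin{bmatrix} \cos\xi & -\sin\xi \\ \sin\xi & \cos\xi \end{bmatrix} z^k \\
&= (1 - \alpha)z^k + \alpha\delta z^k \\
&= |1 - \alpha|z^k + \alpha\delta z^k.
\end{aligned}$$

Case (ii): Using the reflected resolvent $R_{\gamma A}$ in Proposition 6.7 and
that $\alpha > 1$, we conclude that

$$\begin{aligned}
z^{k+1} &= (1 - \alpha)z^k + \alpha R_{\gamma A}R_{\gamma B}z^k \\
&= (1 - \alpha)z^k + \alpha\delta \begin{bmatrix} \cos\xi & \sin\xi \\ -\sin\xi & \cos\xi \end{bmatrix} \begin{bmatrix} \cos(\pi-\xi) & \sin(\pi-\xi) \\ -\sin(\pi-\xi) & \cos(\pi-\xi) \end{bmatrix} z^k \\
&= (1 - \alpha)z^k - \alpha\delta z^k \\
&= -(|1 - \alpha| + \alpha\delta)z^k.
\end{aligned}$$

In both cases, the convergence rate is exactly
$|1 - \alpha| + \alpha\sqrt{1 - \frac{4\gamma\sigma}{1+2\gamma\sigma+(\gamma\beta)^2}}$.
This completes the proof. □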
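The two cases of the proof can be reproduced numerically. As an assumption made for brevity, the sketch again identifies $\mathbb{R}^2$ with the complex plane so that rotations become unit-modulus multipliers; the function name and parameter values are illustrative, and the parameters are chosen so that $\delta > 0$.

```python
import cmath
import math


def dr_contraction_rotation(alpha, gamma, d, psi):
    """Return (|z^{k+1}| / |z^k|, delta) for the example of Proposition 6.9."""
    a = gamma * d * cmath.exp(1j * psi)
    r_a = (1 - a) / (1 + a)       # R_{gamma A} = delta * exp(-i * xi)
    delta = abs(r_a)              # assumed nonzero for the chosen parameters
    unit = r_a / delta            # exp(-i * xi)
    # R_{gamma B}: the inverse rotation for alpha <= 1 (case (i)),
    # and its negative for alpha > 1 (case (ii)).
    r_b = unit.conjugate() if alpha <= 1.0 else -unit.conjugate()
    return abs((1 - alpha) + alpha * r_a * r_b), delta


for alpha in (0.7, 1.2):  # illustrative values, both below 2 / (1 + delta)
    rate, delta = dr_contraction_rotation(alpha, gamma=0.8, d=1.5, psi=0.6)
    print(alpha, rate, abs(1 - alpha) + alpha * delta)
```

For both values of $\alpha$, the observed per-step ratio should coincide with $|1 - \alpha| + \alpha\delta$.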


Remark 6.10. It can be shown that the maximally monotone operator $B_1$ that
gives $R_{\gamma B_1}$ satisfies
$B_1 = \frac{1}{\gamma(1+\cos\xi)}\begin{bmatrix} 0 & \sin\xi \\ -\sin\xi & 0 \end{bmatrix}$
if $\xi \in [0, \pi)$ and $B_1 = \partial\iota_0$ (that is, $B_1$ is the
subdifferential operator of the indicator function $\iota_0$ of the origin)
if $\xi = \pi$. Similarly, the maximally monotone operator $B_2$ that gives
$R_{\gamma B_2}$ satisfies
$B_2 = \frac{1}{\gamma(1-\cos\xi)}\begin{bmatrix} 0 & -\sin\xi \\ \sin\xi & 0 \end{bmatrix}$
if $\xi \in (0, \pi]$ and $B_2 = 0$ if $\xi = 0$.

We have shown that the rate provided in Theorem 6.5 is tight for all
feasible $\alpha$ and $\gamma$.


6.2. Comparison to other bounds 

In Figure 1, we have compared the linear convergence rate result in
Theorem 6.5 to the convergence rate result in [23]. The comparison is made
with optimal $\gamma$-parameters for both bounds. The result in [23] is
provided in the standard Douglas-Rachford setting, i.e., with
$\alpha = 1/2$. By instead letting $\alpha = 1$, this rate can be improved,
see [8] (which shows an improved rate in the composite convex optimization
case, but the same rate can be shown to hold also for monotone inclusion
problems). This improved rate is also added to the comparison in Figure 1.
We see that both rates that follow from [23] are suboptimal and worse than
the rate bound in Theorem 6.5.


7. A strongly monotone and cocoercive 

In this section, we consider the case where $A$ is strongly monotone and
cocoercive. That is, we assume the following.

Figure 1. Convergence rate comparison for general monotone inclusion
problems where one operator is strongly monotone and Lipschitz continuous.
We compare Theorem 6.5 to [23], and an improvement to [23] which holds
when $\alpha = 1$.


Assumption 7.1. Suppose that:

(i) The operators $A : \mathcal{H} \to \mathcal{H}$ and $B : \mathcal{H} \to 2^{\mathcal{H}}$ are maximally monotone,

(ii) $A$ is $\sigma$-strongly monotone and $\frac{1}{\beta}$-cocoercive.


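The linear convergence result for Douglas-Rachford splitting will follow
from the contraction factor of the reflected resolvent of $A$. The contraction
factor is provided in the following theorem, which is proven in Appendix D.

Theorem 7.2. Suppose that $A : \mathcal{H} \to \mathcal{H}$ is a
$\sigma$-strongly monotone and $\frac{1}{\beta}$-cocoercive operator. Then its
reflected resolvent $R_A = 2J_A - \mathrm{Id}$ is contractive with factor
$\sqrt{1 - \frac{4\sigma}{1+2\sigma+\sigma\beta}}$.

When considering the reflected resolvent of $\gamma A$ where
$\gamma \in (0, \infty)$, the $\gamma$-parameter can be chosen to optimize the
contraction factor of $R_{\gamma A}$. The operator $\gamma A$ is
$\gamma\sigma$-strongly monotone and $\frac{1}{\gamma\beta}$-cocoercive, so
the optimal $\gamma > 0$ minimizes
$h(\gamma) := 1 - \frac{4\gamma\sigma}{1+2\gamma\sigma+\gamma^2\sigma\beta}$.
The gradient of $h$ satisfies
$\nabla h(\gamma) = \frac{4\sigma(\gamma^2\sigma\beta - 1)}{(1+2\gamma\sigma+\gamma^2\sigma\beta)^2}$,
so the extreme points of $h$ are given by
$\gamma = \pm\frac{1}{\sqrt{\sigma\beta}}$. Since $\gamma > 0$ and the gradient
is negative for $\gamma \in (0, \frac{1}{\sqrt{\sigma\beta}})$ and positive
for $\gamma > \frac{1}{\sqrt{\sigma\beta}}$, the parameter
$\gamma = \frac{1}{\sqrt{\sigma\beta}}$ minimizes the contraction factor. The
corresponding contraction factor is

$$\sqrt{1 - \frac{4\gamma\sigma}{1+2\gamma\sigma+\gamma^2\sigma\beta}} = \sqrt{1 - \frac{2\sqrt{\sigma/\beta}}{1+\sqrt{\sigma/\beta}}} = \sqrt{\frac{\sqrt{\beta/\sigma}-1}{\sqrt{\beta/\sigma}+1}}.$$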


This is summarized in the following proposition.

Proposition 7.3. The parameter $\gamma \in (0,\infty)$ that optimizes the contraction
factor for $R_{\gamma A}$ is $\gamma = \frac{1}{\sqrt{\sigma\beta}}$. The corresponding contraction factor is
\[
\sqrt{\frac{\sqrt{\beta/\sigma}-1}{\sqrt{\beta/\sigma}+1}}.
\]
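
As a numerical sanity check of Proposition 7.3, the following minimal sketch evaluates $h$ on a grid and compares the grid minimum with $\gamma = 1/\sqrt{\sigma\beta}$. The sample values `SIGMA`, `BETA`, the grid, and all function names are illustrative assumptions, not part of the paper.

```python
import math

def h(gamma, sigma, beta):
    """Squared contraction factor of R_{gamma A}, as defined in the text."""
    return 1.0 - 4.0 * gamma * sigma / (1.0 + 2.0 * gamma * sigma + gamma**2 * sigma * beta)

def optimal_gamma(sigma, beta):
    """Stationary point 1/sqrt(sigma*beta) of h on (0, inf)."""
    return 1.0 / math.sqrt(sigma * beta)

def optimal_factor(sigma, beta):
    """Closed-form contraction factor stated in Proposition 7.3."""
    r = math.sqrt(beta / sigma)
    return math.sqrt((r - 1.0) / (r + 1.0))

# Hypothetical sample values (assumption): any 0 < sigma <= beta would do.
SIGMA, BETA = 0.5, 4.0
GAMMA_STAR = optimal_gamma(SIGMA, BETA)

# Coarse grid over gamma in (0, 10]; the grid minimizer should sit next to GAMMA_STAR.
GRID = [0.01 * k for k in range(1, 1001)]
GAMMA_GRID = min(GRID, key=lambda g: h(g, SIGMA, BETA))
```

The grid search is only a plausibility check; the monotonicity argument above is what establishes the minimizer.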


Now we are ready to state the linear convergence rate result for the 
Douglas-Rachford algorithm. 


Theorem 7.4. Suppose that Assumption 7.1 holds and that the Douglas-Rachford
algorithm is applied to solve $0 \in \gamma Ax + \gamma Bx$. Let
$\delta = \sqrt{1 - \frac{4\gamma\sigma}{1+2\gamma\sigma+\gamma^2\sigma\beta}}$;
then the algorithm converges at least with rate factor
\[
|1 - \alpha| + \alpha\delta \tag{7.1}
\]
for all $\alpha \in (0, \frac{2}{1+\delta})$. Optimizing this bound w.r.t. $\alpha$ and $\gamma$ gives $\alpha = 1$ and
$\gamma = \frac{1}{\sqrt{\sigma\beta}}$ and corresponding optimal rate bound
$\sqrt{\frac{\sqrt{\beta/\sigma}-1}{\sqrt{\beta/\sigma}+1}}$.

Proof. It follows immediately from Theorem 7.2, Lemma 3.3, and
Proposition 7.3 by noting that $\alpha = 1$ minimizes (7.1). □
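
The bound (7.1) is easy to evaluate. The sketch below is an illustration under assumed sample values; the function names `delta` and `rate_bound` are hypothetical. It computes $|1-\alpha| + \alpha\delta$ and rejects $\alpha$ outside $(0, \frac{2}{1+\delta})$.

```python
import math

def delta(gamma, sigma, beta):
    """Contraction factor of R_{gamma A} from Theorem 7.2 applied to gamma*A."""
    return math.sqrt(1.0 - 4.0 * gamma * sigma
                     / (1.0 + 2.0 * gamma * sigma + gamma**2 * sigma * beta))

def rate_bound(alpha, gamma, sigma, beta):
    """Rate factor |1 - alpha| + alpha*delta of (7.1) for feasible alpha."""
    d = delta(gamma, sigma, beta)
    if not 0.0 < alpha < 2.0 / (1.0 + d):
        raise ValueError("alpha must lie in (0, 2/(1+delta))")
    return abs(1.0 - alpha) + alpha * d
```

For fixed $\gamma$, the bound decreases linearly on $(0,1]$ and increases linearly on $[1, \frac{2}{1+\delta})$, which is why $\alpha = 1$ is optimal.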


7.1. Tightness 

In this section, we provide a two-dimensional example that shows that the
provided bounds are tight. We let $A$ be a scaled resolvent of a scaled
rotation operator to achieve this. Let $C$ be that scaled rotation operator,
i.e.,
\[
C = c\begin{bmatrix} \cos\psi & -\sin\psi \\ \sin\psi & \cos\psi \end{bmatrix} \tag{7.2}
\]
with $c \in (1,\infty)$ and $\psi \in [0, \frac{\pi}{2})$. We will let $A$ satisfy $A = dJ_C$ for some
$d \in (0,\infty)$. That is,
\[
A = d(I + C)^{-1} = \frac{d}{(1+c\cos\psi)^2 + c^2\sin^2\psi}
\begin{bmatrix} c\cos\psi + 1 & c\sin\psi \\ -c\sin\psi & c\cos\psi + 1 \end{bmatrix}. \tag{7.3}
\]


In the following proposition, we state the strong monotonicity and
cocoercivity properties of $A$.

Proposition 7.5. The operator $A$ in (7.3) is $\frac{1+c\cos\psi}{d}$-cocoercive and strongly
monotone with modulus $\frac{d(1+c\cos\psi)}{1+2c\cos\psi+c^2}$.

Proof. The matrix $C$ in (7.2) is $c\cos\psi$-strongly monotone (see Proposition 6.6),
so $J_C$ is $(1 + c\cos\psi)$-cocoercive (see [3, Definition 4.4]) and the operator
$A = d(I + C)^{-1}$ is $\frac{1+c\cos\psi}{d}$-cocoercive. Further, since $C$ is monotone and
$c$-Lipschitz continuous (see Proposition 6.6), the following holds (see
Proposition 6.2):
\[
2\langle J_Cx - J_Cy, x - y\rangle \ge \|x-y\|^2 + (1-c^2)\|J_Cx - J_Cy\|^2. \tag{7.4}
\]
Since $J_C$ is $(1 + c\cos\psi)$-cocoercive, we have
\[
\langle J_Cx - J_Cy, x - y\rangle \ge (1 + c\cos\psi)\|J_Cx - J_Cy\|^2. \tag{7.5}
\]
We add (7.5) multiplied by $-\frac{1-c^2}{1+c\cos\psi}$ (which is positive since $c \in (1,\infty)$) to
(7.4) to get
\[
\left(2 - \frac{1-c^2}{1+c\cos\psi}\right)\langle J_Cx - J_Cy, x - y\rangle \ge \|x-y\|^2.
\]
That is, $J_C$ is strongly monotone with modulus
\[
\frac{1}{2 - \frac{1-c^2}{1+c\cos\psi}}
= \frac{1+c\cos\psi}{2 + 2c\cos\psi - 1 + c^2}
= \frac{1+c\cos\psi}{1 + 2c\cos\psi + c^2},
\]
so $A$ is strongly monotone with parameter $\frac{d(1+c\cos\psi)}{1+2c\cos\psi+c^2}$. This concludes the
proof. □
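
The two moduli in Proposition 7.5 can be checked numerically. In the sketch below (an illustration, with hypothetical helper names and randomly sampled test points), $A$ is built from (7.3) and the gaps in the strong monotonicity and cocoercivity inequalities are evaluated. Since $A$ is linear, it suffices to test $\langle Ax, x\rangle$ with $y = 0$.

```python
import math
import random

def make_A(c, psi, d):
    """A = d (I + C)^{-1} from (7.3), returned as a 2x2 nested list."""
    a = 1.0 + c * math.cos(psi)
    b = c * math.sin(psi)
    det = a * a + b * b
    return [[d * a / det, d * b / det], [-d * b / det, d * a / det]]

def moduli(c, psi, d):
    """(sigma, beta): strong monotonicity modulus and inverse cocoercivity constant."""
    a = 1.0 + c * math.cos(psi)
    return d * a / (1.0 + 2.0 * c * math.cos(psi) + c * c), d / a

def worst_gaps(c, psi, d, trials=1000, seed=0):
    """Smallest observed values of <Ax,x> - sigma|x|^2 and <Ax,x> - |Ax|^2/beta."""
    rng = random.Random(seed)
    A = make_A(c, psi, d)
    sigma, beta = moduli(c, psi, d)
    gap_mono = gap_coco = float("inf")
    for _ in range(trials):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        Ax = (A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1])
        inner = Ax[0] * x[0] + Ax[1] * x[1]
        gap_mono = min(gap_mono, inner - sigma * (x[0] ** 2 + x[1] ** 2))
        gap_coco = min(gap_coco, inner - (Ax[0] ** 2 + Ax[1] ** 2) / beta)
    return gap_mono, gap_coco
```

For this particular $A$ both inequalities hold with equality (up to rounding), so both gaps should be numerically zero rather than merely nonnegative.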


This shows that the assumptions needed for the linear convergence rate
result in Theorem 7.4 hold. To prove the tightness claim, we need an
expression for the reflected resolvent of $A$. This is easier expressed in
the strong monotonicity modulus, which we define as $\sigma$, and the inverse
cocoercivity constant, which we define as $\beta$, i.e.,
\[
\sigma = \frac{d(1+c\cos\psi)}{1+2c\cos\psi+c^2}, \qquad \beta = \frac{d}{1+c\cos\psi}. \tag{7.6}
\]
The following result is proven in Appendix D.


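Proposition 7.6. The reflected resolvent $R_{\gamma A}$ of $\gamma A$, where $A$ is defined in
(7.3) and $\gamma \in (0,\infty)$, is given by
\[
R_{\gamma A} = \sqrt{1 - \frac{4\gamma\sigma}{1 + 2\gamma\sigma + \gamma^2\sigma\beta}}
\begin{bmatrix} \cos\xi & -\sin\xi \\ \sin\xi & \cos\xi \end{bmatrix}
\]
where $\sigma$ and $\beta$ are defined in (7.6), and $\xi$ satisfies
$\xi = \operatorname{arctan2}\left(\frac{2\gamma\sqrt{\sigma(\beta-\sigma)}}{1-\sigma\beta\gamma^2}\right)$
with arctan2 defined in (6.4).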


Based on this reflected resolvent, we can show that the rate bound in
Theorem 7.4 is indeed tight. The proof of the following result is the same as
the proof to Proposition 6.9.

Proposition 7.7. Let $\gamma \in (0,\infty)$, $\delta = \sqrt{1 - \frac{4\gamma\sigma}{1+2\gamma\sigma+\gamma^2\sigma\beta}}$, and $\xi$ be as
defined in Proposition 7.6. Suppose that $A$ is as in (7.3) and $B$ is maximally
monotone and satisfies either of the following:

(i) if $\alpha \in (0,1]$: $B = B_1$ with
$R_{\gamma B_1} = \begin{bmatrix} \cos\xi & \sin\xi \\ -\sin\xi & \cos\xi \end{bmatrix}$,

(ii) if $\alpha \in (1, \frac{2}{1+\delta})$: $B = B_2$ with
$R_{\gamma B_2} = \begin{bmatrix} \cos(\pi-\xi) & -\sin(\pi-\xi) \\ \sin(\pi-\xi) & \cos(\pi-\xi) \end{bmatrix}$.

Then the $z^k$ sequence for solving $0 \in \gamma Ax + \gamma Bx$ using the Douglas-Rachford
algorithm converges exactly with the rate $|1 - \alpha| + \alpha\delta$.

So, we have shown that the rate in Theorem 7.4 is tight for all feasible
algorithm parameters $\alpha$ and $\gamma$.
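
The tightness claim can be illustrated by running the relaxed iteration $z^{k+1} = ((1-\alpha)\mathrm{Id} + \alpha R_{\gamma A}R_{\gamma B})z^k$ on the two-dimensional example. The sketch below is a minimal simulation under stated assumptions: the iteration is taken in this standard relaxed form, `math.atan2(y, x)` is assumed to agree with the paper's arctan2 for nonnegative numerators, and all names, starting points, and parameter values are hypothetical.

```python
import math

def rot(t):
    """Counterclockwise rotation [[cos t, -sin t], [sin t, cos t]]."""
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def refl_resolvent_gamma_A(c, psi, d, gamma):
    """R_{gamma A} = 2 (I + gamma A)^{-1} - I, computed directly from (7.3)."""
    a, b = 1.0 + c * math.cos(psi), c * math.sin(psi)
    det = a * a + b * b
    p, q = 1.0 + gamma * d * a / det, gamma * d * b / det  # I + gamma A = [[p, q], [-q, p]]
    n = p * p + q * q
    return [[2 * p / n - 1, -2 * q / n], [2 * q / n, 2 * p / n - 1]]

def observed_and_predicted(c, psi, d, gamma, alpha, iters=20):
    """Per-step ratios |z^{k+1}| / |z^k| and the predicted rate |1-alpha| + alpha*delta."""
    a = 1.0 + c * math.cos(psi)
    sigma = d * a / (1.0 + 2.0 * c * math.cos(psi) + c * c)
    beta = d / a
    big = 1.0 + 2.0 * gamma * sigma + gamma**2 * sigma * beta
    delta = math.sqrt(1.0 - 4.0 * gamma * sigma / big)
    xi = math.atan2(2.0 * gamma * math.sqrt(max(0.0, sigma * (beta - sigma))),
                    1.0 - sigma * beta * gamma**2)
    # Reflected resolvent of gamma*B chosen as in Proposition 7.7 (i) or (ii).
    RB = rot(-xi) if alpha <= 1.0 else rot(math.pi - xi)
    RA = refl_resolvent_gamma_A(c, psi, d, gamma)
    z, ratios = [1.0, -2.0], []
    for _ in range(iters):
        w = matvec(RA, matvec(RB, z))
        z_next = [(1.0 - alpha) * z[i] + alpha * w[i] for i in range(2)]
        ratios.append(math.hypot(*z_next) / math.hypot(*z))
        z = z_next
    return ratios, abs(1.0 - alpha) + alpha * delta
```

In case (i) the composition $R_{\gamma A}R_{\gamma B_1}$ equals $\delta\,\mathrm{Id}$ and in case (ii) it equals $-\delta\,\mathrm{Id}$, so every step contracts by exactly the predicted factor.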
Figure 2. Convergence rate comparison between Theorem 6.5, Theorem 7.4,
and [17]. In all, one operator has both regularity properties. It is strongly
monotone in all examples and Lipschitz in Theorem 6.5, cocoercive in
Theorem 7.4 (which is stronger than Lipschitz), and a cocoercive
subdifferential operator in [17] (which is the strongest property). The
worst-case rate improves when the class of problems becomes more restricted.

7.2. Comparison to other bounds 

We have shown tight convergence rate estimates for Douglas-Rachford
splitting when the monotone operator $A$ is cocoercive and strongly monotone
(Theorem 7.4). In Section 6 we showed tight estimates when $A$ is Lipschitz
and strongly monotone (Theorem 6.5). In [17], tight convergence rate
estimates are proven for the case when $A$ and $B$ are subdifferential operators
of proper closed and convex functions and $A$ is strongly monotone and
Lipschitz continuous (which in this case is equivalent to cocoercive). The
class of problems considered in [17] is a subclass of the problems considered
in this section, which in turn is a subclass of the problems considered in
Section 6. The optimal rates for these classes of problems are shown in
Figure 2. By restricting the problem classes, the rate bounds get tighter.
This is in contrast to the case in Section 5, where a convex optimization
problem achieved the worst case estimate.


8. Conclusions 

We have shown linear convergence rate bounds for Douglas-Rachford splitting
for monotone inclusion problems with three different sets of assumptions. One
setting was the one used by Lions and Mercier [24], for which we provided a
tighter bound. We also stated linear convergence rate bounds under two other
assumptions, for which no other linear rate bounds were previously available.
In addition, we have shown that all our rate bounds are tight: in two cases
for all feasible algorithm parameters, and in the remaining case for many
algorithm parameters.
Appendix A. Proofs to Lemmas in Section 3

A.1. Proof to Lemma 3.1

From the definition of cocoercivity, Definition 2.6, it follows directly that
$\beta\mathrm{Id} + T$ is $\frac{1}{2\beta}$-cocoercive if and only if $\frac{1}{2\beta}(\beta\mathrm{Id} + T)$ is 1-cocoercive. This,
in turn, is equivalent to that $2\frac{1}{2\beta}(\beta\mathrm{Id} + T) - \mathrm{Id} = \frac{1}{\beta}T$ is nonexpansive
[3, Proposition 4.2 and Definition 4.4]. Finally, from the definition of
Lipschitz continuity, Definition 2.2, it follows directly that $\frac{1}{\beta}T$ is
nonexpansive if and only if $T$ is $\beta$-Lipschitz continuous. This concludes the
proof.

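A.2. Proof to Lemma 3.2

Let $T_1 = R - \frac{\beta}{2}\mathrm{Id}$. Then Lemma 3.1 states that $\frac{1}{\beta}$-cocoercivity of
$T_1 + \frac{\beta}{2}\mathrm{Id} = R$ is equivalent to $\frac{\beta}{2}$-Lipschitz continuity of $T_1 = R - \frac{\beta}{2}\mathrm{Id}$. By
definition of Lipschitz continuity, this is equivalent to that $T_1 = \frac{\beta}{2}T_2$ for
some nonexpansive operator $T_2$. Therefore
$T = R + (1 - \beta)\mathrm{Id} = T_1 + (1 - \frac{\beta}{2})\mathrm{Id} = \frac{\beta}{2}T_2 + (1 - \frac{\beta}{2})\mathrm{Id}$.
Since $\beta \in (0,1)$, this is equivalent to that $T$ is $\frac{\beta}{2}$-averaged. This concludes
the proof.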

A.3. Proof to Lemma 3.3

Let $x$ and $y$ be any points in $\mathcal{H}$. Then
\begin{align*}
\|Rx - Ry\| &= \|(1 - \alpha)x + \alpha Tx - (1 - \alpha)y - \alpha Ty\| \\
&\le |1-\alpha|\|x-y\| + |\alpha|\|Tx - Ty\| \\
&\le |1-\alpha|\|x-y\| + |\alpha|\delta\|x-y\| \\
&= (|1-\alpha| + |\alpha|\delta)\|x-y\|.
\end{align*}
So $R$ is $(|1-\alpha| + |\alpha|\delta)$-Lipschitz continuous. The Lipschitz constant
is less than 1 if $\alpha \in (0, \frac{2}{1+\delta})$. For such $\alpha$, $R$ is contractive. Since $\alpha > 0$, the
contraction factor is $|1-\alpha| + \alpha\delta$. This concludes the proof.

Appendix B. Proofs to results in Section 5

B.1. Proof to Proposition 5.2

Since $B$ is $\frac{1}{\beta}$-cocoercive, it satisfies
\[
\langle Bu - Bv, u - v\rangle \ge \tfrac{1}{\beta}\|Bu - Bv\|^2.
\]
Adding $\|u - v\|^2$ to both sides gives
\[
\langle (B + \mathrm{Id})u - (B + \mathrm{Id})v, u - v\rangle \ge \|u - v\|^2
+ \tfrac{1}{\beta}\|(B + \mathrm{Id})u - (B + \mathrm{Id})v - (u - v)\|^2.
\]
Letting $x = (B + \mathrm{Id})u$ and $y = (B + \mathrm{Id})v$ implies that $u = J_Bx$ and $v = J_By$.
Therefore, we get the equivalent expression
\[
\langle x - y, J_Bx - J_By\rangle \ge \|J_Bx - J_By\|^2 + \tfrac{1}{\beta}\|x - y - (J_Bx - J_By)\|^2.
\]
Multiplying by $\beta$ and expanding the second square gives
\[
\beta\langle x - y, J_Bx - J_By\rangle
\ge \beta\|J_Bx - J_By\|^2 + \|x - y\|^2 - 2\langle x - y, J_Bx - J_By\rangle + \|J_Bx - J_By\|^2,
\]
or equivalently
\begin{align*}
& (\beta + 2)\langle x - y, J_Bx - J_By\rangle \ge \|x - y\|^2 + (\beta + 1)\|J_Bx - J_By\|^2 \\
\Leftrightarrow\;& \tfrac{\beta+2}{\beta+1}\langle x - y, J_Bx - J_By\rangle \ge \tfrac{1}{\beta+1}\|x - y\|^2 + \|J_Bx - J_By\|^2 \\
\Leftrightarrow\;& 2\left(1 - \tfrac{\beta}{2(1+\beta)}\right)\langle x - y, J_Bx - J_By\rangle
\ge \left(1 - 2\tfrac{\beta}{2(1+\beta)}\right)\|x - y\|^2 + \|J_Bx - J_By\|^2.
\end{align*}
This is, by [3, Proposition 4.25], equivalent to that $J_B$ is $\frac{\beta}{2(\beta+1)}$-averaged.
This concludes the proof.


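B.2. Proof to Theorem 5.6

Since $R_{\gamma A}R_{\gamma B}$ is $\frac{\frac{1}{\gamma\sigma}+\gamma\beta}{\frac{1}{\gamma\sigma}+\gamma\beta+1}$-negatively averaged, see Proposition 5.5, the
Douglas-Rachford iteration is defined by an $\alpha$-averaged
$\frac{\frac{1}{\gamma\sigma}+\gamma\beta}{\frac{1}{\gamma\sigma}+\gamma\beta+1}$-negatively averaged operator. The rate in (5.1) follows directly
from Proposition 3.9. The optimal parameters follow from Proposition 3.10.
It shows that the rate factor is increasing in
$\frac{\frac{1}{\gamma\sigma}+\gamma\beta}{\frac{1}{\gamma\sigma}+\gamma\beta+1}$, which in turn is
increasing in $\frac{1}{\gamma\sigma}+\gamma\beta$. Therefore this should be minimized to optimize the
rate. The optimal $\gamma = \frac{1}{\sqrt{\sigma\beta}}$ gives negative averagedness factor
\[
\frac{\frac{1}{\gamma\sigma}+\gamma\beta}{\frac{1}{\gamma\sigma}+\gamma\beta+1} = \frac{2\sqrt{\beta/\sigma}}{1+2\sqrt{\beta/\sigma}}.
\]
Proposition 3.10 further gives that the optimal averagedness factor is
\[
\alpha = \frac{1}{2 - \frac{2\sqrt{\beta/\sigma}}{1+2\sqrt{\beta/\sigma}}}
= \frac{1+2\sqrt{\beta/\sigma}}{2+2\sqrt{\beta/\sigma}}
= \frac{1/2+\sqrt{\beta/\sigma}}{1+\sqrt{\beta/\sigma}}
\]
and that the optimal bound on the contraction factor is
\[
\frac{\frac{2\sqrt{\beta/\sigma}}{2\sqrt{\beta/\sigma}+1}}{2 - \frac{2\sqrt{\beta/\sigma}}{1+2\sqrt{\beta/\sigma}}}
= \frac{2\sqrt{\beta/\sigma}}{2+2\sqrt{\beta/\sigma}}
= \frac{\sqrt{\beta/\sigma}}{1+\sqrt{\beta/\sigma}}.
\]
This concludes the proof.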


B.3. Proof to Proposition 5.7

The proximal and reflected proximal operators of $f$ are trivially given by

prox 7/ (y) = { T p^yi,y 2 ), R~, f (y) = (B.l) 

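Linearity of the proximal operator and Moreau's decomposition [3,
Theorem 14.3] imply that the reflected resolvent of $f^*$ is given by
\begin{align*}
R_{\gamma f^*} = 2\mathrm{prox}_{\gamma f^*} - \mathrm{Id}
&= 2\left(\mathrm{Id} - \gamma\,\mathrm{prox}_{\gamma^{-1}f} \circ (\gamma^{-1}\mathrm{Id})\right) - \mathrm{Id}
= -(2\mathrm{prox}_{\gamma^{-1}f} - \mathrm{Id}) \\
&= -R_{\gamma^{-1}f}.
\end{align*}
This gives the following Douglas-Rachford iteration:
\begin{align*}
z^{k+1} &= ((1 - \alpha)\mathrm{Id} + \alpha R_{\gamma f}R_{\gamma f^*})z^k \\
&= (1 - \alpha)z^k - \alpha R_{\gamma f}R_{\gamma^{-1}f}z^k,
\end{align*}
where the second component of $R_{\gamma f}R_{\gamma^{-1}f}z^k$ is $z_2^k$.

Since we start at a point $z^0 = (0, z_2^0)$, we will get $z_1^k = 0$ for all $k \ge 1$, and
the Douglas-Rachford iteration becomes
\[
z^{k+1} = (1 - 2\alpha)z^k
\]
with contraction factor given by $|1 - 2\alpha|$.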

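When $\alpha \in [c, 1)$, the expression inside the absolute value in (5.1) is
nonpositive since
\[
1 - 2\alpha + \alpha\frac{\frac{1}{\gamma\sigma}+\gamma\beta}{1+\frac{1}{\gamma\sigma}+\gamma\beta} \le 0
\quad\Leftrightarrow\quad
\alpha \ge \frac{1}{2 - \frac{\frac{1}{\gamma\sigma}+\gamma\beta}{1+\frac{1}{\gamma\sigma}+\gamma\beta}}
= \frac{1+\frac{1}{\gamma\sigma}+\gamma\beta}{2+\frac{1}{\gamma\sigma}+\gamma\beta}.
\]
Therefore, for such $\alpha$, the rate in (5.1) is $|1 - 2\alpha|$. This coincides with the
rate for the provided example for any $\gamma > 0$, and the proof is completed.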


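Appendix C. Proofs to results in Section 6

C.1. Proof to Proposition 6.2

$\beta$-Lipschitz continuity of $A$ implies that $\beta\mathrm{Id} + A$ is $\frac{1}{2\beta}$-cocoercive, see
Lemma 3.1. That is
\[
\langle (\beta\mathrm{Id} + A)u - (\beta\mathrm{Id} + A)v, u - v\rangle \ge \tfrac{1}{2\beta}\|(\beta\mathrm{Id} + A)u - (\beta\mathrm{Id} + A)v\|^2.
\]
Using $\beta\mathrm{Id} = \mathrm{Id} + (\beta - 1)\mathrm{Id}$, this is equivalent to that
\begin{align*}
\langle (\mathrm{Id} + A)u - (\mathrm{Id} + A)v, u - v\rangle
\ge{}& \tfrac{1}{2\beta}\|(\mathrm{Id} + A)u - (\mathrm{Id} + A)v + (\beta - 1)(u - v)\|^2 \\
& + (1 - \beta)\|u - v\|^2.
\end{align*}
Using that $x = (\mathrm{Id} + A)u$ if and only if $u = (\mathrm{Id} + A)^{-1}x$ and $y = (\mathrm{Id} + A)v$
if and only if $v = (\mathrm{Id} + A)^{-1}y$ (that hold by definition of the inverse and
single-valuedness), this is equivalent to
\begin{align*}
\langle x - y, (\mathrm{Id} + A)^{-1}x - (\mathrm{Id} + A)^{-1}y\rangle
\ge{}& \tfrac{1}{2\beta}\|x - y + (\beta - 1)((\mathrm{Id} + A)^{-1}x - (\mathrm{Id} + A)^{-1}y)\|^2 \\
& + (1 - \beta)\|(\mathrm{Id} + A)^{-1}x - (\mathrm{Id} + A)^{-1}y\|^2.
\end{align*}
Identifying the resolvent $J_A = (\mathrm{Id} + A)^{-1}$ and expanding the first square
give:
\begin{align*}
\langle J_Ax - J_Ay, x - y\rangle
\ge{}& \tfrac{1}{2\beta}\|x - y + (\beta - 1)(J_Ax - J_Ay)\|^2 + (1 - \beta)\|J_Ax - J_Ay\|^2 \\
={}& \tfrac{1}{2\beta}\left(\|x - y\|^2 + 2(\beta - 1)\langle J_Ax - J_Ay, x - y\rangle + (\beta - 1)^2\|J_Ax - J_Ay\|^2\right) \\
& + (1 - \beta)\|J_Ax - J_Ay\|^2.
\end{align*}
By rearranging the terms, we conclude that
\begin{align*}
\left(1 + \tfrac{1-\beta}{\beta}\right)\langle J_Ax - J_Ay, x - y\rangle
\ge{}& \tfrac{1}{2\beta}\left(\|x - y\|^2 + (\beta - 1)^2\|J_Ax - J_Ay\|^2\right) + (1 - \beta)\|J_Ax - J_Ay\|^2 \\
={}& \tfrac{1}{2\beta}\|x - y\|^2 + \left(\tfrac{(\beta-1)^2}{2\beta} + (1 - \beta)\right)\|J_Ax - J_Ay\|^2 \\
={}& \tfrac{1}{2\beta}\|x - y\|^2 + \tfrac{1-\beta^2}{2\beta}\|J_Ax - J_Ay\|^2.
\end{align*}
The result follows by multiplying by $2\beta$, since $1 + \frac{1-\beta}{\beta} = \frac{1}{\beta}$. This concludes
the proof.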

C.2. Proof to Theorem 6.3

We divide the proof into two cases, $\beta \ge 1$ and $\beta < 1$.

Case $\beta \ge 1$

From [3, Proposition 23.11], we get that $J_A$ is $(1 + \sigma)$-cocoercive, i.e., that
\[
\langle J_Ax - J_Ay, x - y\rangle \ge (1 + \sigma)\|J_Ax - J_Ay\|^2. \tag{C.1}
\]
Adding $(\beta^2 - 1)\,(\ge 0)$ of (C.1) to $(1 + \sigma)$ of (6.1), we get
\[
(2(1 + \sigma) + (\beta^2 - 1))\langle J_Ax - J_Ay, x - y\rangle \ge (1 + \sigma)\|x - y\|^2
\]
or equivalently
\[
\langle J_Ax - J_Ay, x - y\rangle \ge \frac{1+\sigma}{1+2\sigma+\beta^2}\|x - y\|^2 \tag{C.2}
\]
since the $\|J_Ax - J_Ay\|$ terms cancel. We get
\begin{align*}
\|R_Ax - R_Ay\|^2 &= \|2J_Ax - 2J_Ay - (x - y)\|^2 \\
&= 4\|J_Ax - J_Ay\|^2 - 4\langle J_Ax - J_Ay, x - y\rangle + \|x - y\|^2 \\
&\le 4\left(\tfrac{1}{1+\sigma} - 1\right)\langle J_Ax - J_Ay, x - y\rangle + \|x - y\|^2 \\
&= -\tfrac{4\sigma}{1+\sigma}\langle J_Ax - J_Ay, x - y\rangle + \|x - y\|^2 \\
&\le -\tfrac{4\sigma}{1+2\sigma+\beta^2}\|x - y\|^2 + \|x - y\|^2 \\
&= \left(1 - \tfrac{4\sigma}{1+2\sigma+\beta^2}\right)\|x - y\|^2 \tag{C.3}
\end{align*}
where (C.1) and (C.2) are used in the inequalities. Thus, the said result holds
for $\beta \ge 1$.

Case $\beta < 1$

To prove the result for $\beta < 1$, we define the set $\mathcal{R}$ of pairs of points
$(x, y) \in \mathcal{H} \times \mathcal{H}$ as follows:
\[
\mathcal{R} = \left\{(x, y) \mid \langle J_Ax - J_Ay, x - y\rangle \ge \tfrac{1+\sigma}{1+2\sigma+\beta^2}\|x - y\|^2\right\}. \tag{C.4}
\]
We also define the closure of the remaining pairs of points
$\mathcal{R}^c = \overline{(\mathcal{H} \times \mathcal{H}) \setminus \mathcal{R}}$, i.e.,
\[
\mathcal{R}^c = \left\{(x, y) \mid \langle J_Ax - J_Ay, x - y\rangle \le \tfrac{1+\sigma}{1+2\sigma+\beta^2}\|x - y\|^2\right\}. \tag{C.5}
\]
Obviously, $\mathcal{H} \times \mathcal{H} \subseteq \mathcal{R} \cup \mathcal{R}^c$, which implies that the contraction factor of
the reflected resolvent is the worst-case contraction factor for $\mathcal{R}$ and $\mathcal{R}^c$. We
first show the contraction factor for $\mathcal{R}$. Since (C.2) is the definition of the
set $\mathcal{R}$ in (C.4), the contraction factor for $(x, y) \in \mathcal{R}$ is shown exactly as in
(C.3). For $(x, y) \in \mathcal{R}^c$, we have
\begin{align*}
\|R_Ax - R_Ay\|^2 &= \|2J_Ax - 2J_Ay - (x - y)\|^2 \\
&= 4\|J_Ax - J_Ay\|^2 - 4\langle J_Ax - J_Ay, x - y\rangle + \|x - y\|^2 \\
&\le \tfrac{4}{1-\beta^2}\left(2\langle J_Ax - J_Ay, x - y\rangle - \|x - y\|^2\right)
 - 4\langle J_Ax - J_Ay, x - y\rangle + \|x - y\|^2 \\
&= 4\left(\tfrac{2}{1-\beta^2} - 1\right)\langle J_Ax - J_Ay, x - y\rangle + \left(1 - \tfrac{4}{1-\beta^2}\right)\|x - y\|^2 \\
&\le \left(1 - \tfrac{4}{1-\beta^2} + 4\,\tfrac{1+\beta^2}{1-\beta^2}\,\tfrac{1+\sigma}{1+2\sigma+\beta^2}\right)\|x - y\|^2 \\
&= \left(1 - \tfrac{4(1+2\sigma+\beta^2) - 4(1+\beta^2)(1+\sigma)}{(1-\beta^2)(1+2\sigma+\beta^2)}\right)\|x - y\|^2 \\
&= \left(1 - \tfrac{4+8\sigma+4\beta^2 - (4+4\beta^2+4\sigma+4\sigma\beta^2)}{(1-\beta^2)(1+2\sigma+\beta^2)}\right)\|x - y\|^2 \\
&= \left(1 - \tfrac{4\sigma(1-\beta^2)}{(1-\beta^2)(1+2\sigma+\beta^2)}\right)\|x - y\|^2 \\
&= \left(1 - \tfrac{4\sigma}{1+2\sigma+\beta^2}\right)\|x - y\|^2
\end{align*}
where (6.1) is used in the first inequality and the definition of $\mathcal{R}^c$ in (C.5) in
the second. That is, the worst case contraction factor is $\sqrt{1 - \frac{4\sigma}{1+2\sigma+\beta^2}}$ also
for $\beta < 1$.

It remains to show that the contraction factor is in the interval $[0,1)$.
We show that the square of the contraction factor is in $[0,1)$. We have
$1 - \frac{4\sigma}{1+2\sigma+\beta^2} = \frac{1-2\sigma+\beta^2}{1+2\sigma+\beta^2} < 1$. Further, since $\sigma \le \beta$, we have
$1 - 2\sigma + \beta^2 \ge 1 - 2\sigma + \sigma^2 = (1 - \sigma)^2 \ge 0$. So the numerator is nonnegative and
the denominator is positive, which gives a nonnegative contraction factor.
This concludes the proof.


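C.3. Proof to Proposition 6.7

First, we compute the resolvent $J_{\gamma A}$. It satisfies
\begin{align*}
J_{\gamma A} = (I + \gamma A)^{-1}
&= \begin{bmatrix} 1 + \gamma d\cos\psi & -\gamma d\sin\psi \\ \gamma d\sin\psi & 1 + \gamma d\cos\psi \end{bmatrix}^{-1} \\
&= \frac{1}{1 + 2\gamma d\cos\psi + (\gamma d)^2}
\begin{bmatrix} 1 + \gamma d\cos\psi & \gamma d\sin\psi \\ -\gamma d\sin\psi & 1 + \gamma d\cos\psi \end{bmatrix} \\
&= \frac{1}{1 + 2\gamma\sigma + (\gamma\beta)^2}
\begin{bmatrix} 1 + \gamma\sigma & \gamma\beta\sin\psi \\ -\gamma\beta\sin\psi & 1 + \gamma\sigma \end{bmatrix}.
\end{align*}
The reflected resolvent is
\begin{align*}
R_{\gamma A} = 2J_{\gamma A} - I
&= \frac{2}{1 + 2\gamma\sigma + (\gamma\beta)^2}
\begin{bmatrix} 1 + \gamma\sigma - \frac{1+2\gamma\sigma+(\gamma\beta)^2}{2} & \gamma\beta\sin\psi \\ -\gamma\beta\sin\psi & 1 + \gamma\sigma - \frac{1+2\gamma\sigma+(\gamma\beta)^2}{2} \end{bmatrix} \\
&= \frac{2}{1 + 2\gamma\sigma + (\gamma\beta)^2}
\begin{bmatrix} \frac{1}{2}(1 - (\gamma\beta)^2) & \gamma\beta\sin\psi \\ -\gamma\beta\sin\psi & \frac{1}{2}(1 - (\gamma\beta)^2) \end{bmatrix}
\end{align*}
where we have used
\[
(1 + \gamma\sigma)^2 + (\gamma\beta)^2\sin^2\psi = (1 + \gamma\beta\cos\psi)^2 + (\gamma\beta)^2\sin^2\psi = 1 + 2\gamma\sigma + (\gamma\beta)^2.
\]
Since $\gamma\beta\sin\psi$ is nonnegative, this implies
\[
\gamma\beta\sin\psi = \sqrt{1 + 2\gamma\sigma + (\gamma\beta)^2 - (1 + \gamma\sigma)^2} = \gamma\sqrt{\beta^2 - \sigma^2}.
\]
Therefore, the reflected resolvent is
\[
R_{\gamma A} = \frac{2}{1 + 2\gamma\sigma + (\gamma\beta)^2}
\begin{bmatrix} \frac{1}{2}(1 - (\gamma\beta)^2) & \gamma\sqrt{\beta^2 - \sigma^2} \\ -\gamma\sqrt{\beta^2 - \sigma^2} & \frac{1}{2}(1 - (\gamma\beta)^2) \end{bmatrix}.
\]
Now, let us introduce polar coordinates of the elements:
\[
\delta(\cos\xi, \sin\xi) = \left(\tfrac{1}{2}(1 - (\gamma\beta)^2),\ \gamma\sqrt{\beta^2 - \sigma^2}\right)
\]
which gives reflected resolvent
\[
R_{\gamma A} = \frac{2\delta}{1 + 2\gamma\sigma + (\gamma\beta)^2}
\begin{bmatrix} \cos\xi & \sin\xi \\ -\sin\xi & \cos\xi \end{bmatrix}. \tag{C.6}
\]
The angle $\xi$ in the polar coordinate satisfies
\[
\tan\xi = \frac{2\gamma\sqrt{\beta^2 - \sigma^2}}{1 - (\gamma\beta)^2}
\]
and since the numerator is nonnegative,
$\xi = \operatorname{arctan2}\left(\frac{2\gamma\sqrt{\beta^2-\sigma^2}}{1-(\gamma\beta)^2}\right)$, where
arctan2 is defined in (6.4). For the radius $\delta$ in the polar coordinate, we get
\begin{align*}
\delta^2 &= \delta^2(\cos^2\xi + \sin^2\xi) \\
&= \tfrac{1}{4}(1 - (\gamma\beta)^2)^2 + \gamma^2(\beta^2 - \sigma^2) \\
&= \tfrac{1}{4}(1 - 2(\gamma\beta)^2 + (\gamma\beta)^4) + \gamma^2(\beta^2 - \sigma^2) \\
&= \tfrac{1}{4}(1 - 2\gamma\sigma + (\gamma\beta)^2)(1 + 2\gamma\sigma + (\gamma\beta)^2)
\end{align*}
and (since $\delta \ge 0$)
\[
2\delta = \sqrt{(1 - 2\gamma\sigma + (\gamma\beta)^2)(1 + 2\gamma\sigma + (\gamma\beta)^2)}. \tag{C.7}
\]
It remains to compute the factor in (C.6). Using (C.7), we get
\[
\frac{2\delta}{1 + 2\gamma\sigma + (\gamma\beta)^2}
= \frac{\sqrt{(1 - 2\gamma\sigma + (\gamma\beta)^2)(1 + 2\gamma\sigma + (\gamma\beta)^2)}}{1 + 2\gamma\sigma + (\gamma\beta)^2}
= \sqrt{\frac{1 - 2\gamma\sigma + (\gamma\beta)^2}{1 + 2\gamma\sigma + (\gamma\beta)^2}}
= \sqrt{1 - \frac{4\gamma\sigma}{1 + 2\gamma\sigma + (\gamma\beta)^2}}.
\]
This completes the proof.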
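
The closed form derived in this proof can be compared against a direct matrix computation. The sketch below assumes, as in the proof, that $A = d$ times a rotation by $\psi \in [0, \frac{\pi}{2})$ with $\sigma = d\cos\psi$ and $\beta = d$, and that `math.atan2(y, x)` matches the paper's arctan2 when the numerator is nonnegative. Function names and sample values are hypothetical.

```python
import math

def refl_resolvent_direct(d, psi, gamma):
    """2 (I + gamma A)^{-1} - I for A = d * rotation(psi), via the explicit 2x2 inverse."""
    p = 1.0 + gamma * d * math.cos(psi)
    q = gamma * d * math.sin(psi)          # I + gamma A = [[p, -q], [q, p]]
    n = p * p + q * q                      # determinant of I + gamma A
    return [[2 * p / n - 1, 2 * q / n], [-2 * q / n, 2 * p / n - 1]]

def refl_resolvent_formula(d, psi, gamma):
    """Scaled rotation from (C.6) with the simplified factor and angle xi."""
    sigma, beta = d * math.cos(psi), d
    big = 1.0 + 2.0 * gamma * sigma + (gamma * beta) ** 2
    factor = math.sqrt(1.0 - 4.0 * gamma * sigma / big)
    xi = math.atan2(2.0 * gamma * math.sqrt(max(0.0, beta**2 - sigma**2)),
                    1.0 - (gamma * beta) ** 2)
    return [[factor * math.cos(xi), factor * math.sin(xi)],
            [-factor * math.sin(xi), factor * math.cos(xi)]]

def max_entry_difference(d, psi, gamma):
    """Largest absolute entrywise difference between the two computations."""
    D, F = refl_resolvent_direct(d, psi, gamma), refl_resolvent_formula(d, psi, gamma)
    return max(abs(D[i][j] - F[i][j]) for i in range(2) for j in range(2))
```

Parameter choices with $\gamma\beta > 1$ exercise the branch where the denominator $1 - (\gamma\beta)^2$ is negative and $\xi$ lies in $(\frac{\pi}{2}, \pi]$.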


Appendix D. Proofs to results in Section 7

D.1. Proof to Theorem 7.2

We know from Lemma 3.6 and Definition 2.6 that $J_A$ is $(1 + \sigma)$-cocoercive,
i.e., that it satisfies
\[
\langle J_Ax - J_Ay, x - y\rangle \ge (1 + \sigma)\|J_Ax - J_Ay\|^2 \tag{D.1}
\]
for all $x, y \in \mathcal{H}$. From Proposition 5.2 we know that $J_A$ is $\frac{\beta}{2(1+\beta)}$-averaged,
i.e., that it satisfies (see [3, Proposition 4.25(iv)])
\[
2\left(1 - \tfrac{\beta}{2(1+\beta)}\right)\langle J_Ax - J_Ay, x - y\rangle \ge \left(1 - \tfrac{\beta}{1+\beta}\right)\|x - y\|^2 + \|J_Ax - J_Ay\|^2 \tag{D.2}
\]
for all $x, y \in \mathcal{H}$. Let $\alpha = \frac{\beta}{2(1+\beta)}$ and $\delta = \frac{1}{1+\sigma}$ and define the set $\mathcal{R}$ of pairs of
points $(x, y) \in \mathcal{H} \times \mathcal{H}$ as:
\[
\mathcal{R} = \left\{(x, y) \mid \langle J_Ax - J_Ay, x - y\rangle \ge \left(\tfrac{\delta}{2} + \tfrac{(1-\alpha-\delta/2)^2 - \alpha^2 + (\delta/2)^2}{2(1-\alpha-\delta/2)}\right)\|x - y\|^2\right\}. \tag{D.3}
\]
We also define the closure of the set of remaining pairs of points
$\mathcal{R}^c = \overline{(\mathcal{H} \times \mathcal{H}) \setminus \mathcal{R}}$, i.e.,
\[
\mathcal{R}^c = \left\{(x, y) \mid \langle J_Ax - J_Ay, x - y\rangle \le \left(\tfrac{\delta}{2} + \tfrac{(1-\alpha-\delta/2)^2 - \alpha^2 + (\delta/2)^2}{2(1-\alpha-\delta/2)}\right)\|x - y\|^2\right\}. \tag{D.4}
\]
Obviously, the contraction factor for $R_A$ is the worst-case contraction factor
for pairs of points in $\mathcal{R}$ and $\mathcal{R}^c$.

Contraction factor on $\mathcal{R}$

First, we provide a contraction factor for pairs of points in $\mathcal{R}$. Since
$R_A = 2J_A - \mathrm{Id}$, we have
\begin{align*}
\|R_Ax - R_Ay\|^2 &= 4\|J_Ax - J_Ay\|^2 - 4\langle J_Ax - J_Ay, x - y\rangle + \|x - y\|^2 \\
&\le (4\delta - 4)\langle J_Ax - J_Ay, x - y\rangle + \|x - y\|^2 \\
&\le \left(4(\delta - 1)\left(\tfrac{\delta}{2} + \tfrac{(1-\alpha-\delta/2)^2 - \alpha^2 + (\delta/2)^2}{2(1-\alpha-\delta/2)}\right) + 1\right)\|x - y\|^2
\end{align*}
where $\delta = \frac{1}{1+\sigma} \in (0,1)$ and $\alpha = \frac{\beta}{2(1+\beta)}$, and the inequalities follow from (D.1)
and the definition of $\mathcal{R}$ in (D.3).

Contraction factor on $\mathcal{R}^c$

Next, we provide a contraction factor for pairs of points in $\mathcal{R}^c$. Since
$R_A = 2J_A - \mathrm{Id}$, we conclude that
\begin{align*}
\|R_Ax - R_Ay\|^2
&= 4\|J_Ax - J_Ay\|^2 - 4\langle J_Ax - J_Ay, x - y\rangle + \|x - y\|^2 \\
&\le 4(2(1 - \alpha) - 1)\langle J_Ax - J_Ay, x - y\rangle + (1 - 4(1 - 2\alpha))\|x - y\|^2 \\
&\le \left(4(1 - 2\alpha)\left(\tfrac{\delta}{2} + \tfrac{(1-\alpha-\delta/2)^2 - \alpha^2 + (\delta/2)^2}{2(1-\alpha-\delta/2)}\right) + 1 - 4 + 8\alpha\right)\|x - y\|^2,
\end{align*}
where we have used that $\alpha \in (0, \frac{1}{2})$, (D.2), and the definition of $\mathcal{R}^c$ in (D.4).

Contraction factor of $R_A$

Here, we show that the contraction factors on $\mathcal{R}$ and $\mathcal{R}^c$ are identical, and
we simplify the expression to get a final contraction factor for the reflected
resolvent $R_A$. That the contraction factors are identical is shown by verifying
that the difference between them is zero:
\begin{align*}
& 4(1 - 2\alpha)\left(\tfrac{\delta}{2} + \tfrac{(1-\alpha-\delta/2)^2 - \alpha^2 + (\delta/2)^2}{2(1-\alpha-\delta/2)}\right) + 1 - 4 + 8\alpha \\
&\quad - 4(\delta - 1)\left(\tfrac{\delta}{2} + \tfrac{(1-\alpha-\delta/2)^2 - \alpha^2 + (\delta/2)^2}{2(1-\alpha-\delta/2)}\right) - 1 \\
&= 4(2 - 2\alpha - \delta)\left(\tfrac{\delta}{2} + \tfrac{(1-\alpha-\delta/2)^2 - \alpha^2 + (\delta/2)^2}{2(1-\alpha-\delta/2)}\right) - 4 + 8\alpha \\
&= 2(2 - 2\alpha - \delta)\delta + 4\left((1 - \alpha - \delta/2)^2 - \alpha^2 + (\delta/2)^2\right) - 4 + 8\alpha \\
&= 2(2 - 2\alpha - \delta)\delta + 4\left(1 - 2\alpha - \delta + \alpha\delta + \alpha^2 + (\delta/2)^2 - \alpha^2 + (\delta/2)^2\right) - 4 + 8\alpha \\
&= 4\delta - 4\alpha\delta - 2\delta^2 + 4 - 8\alpha - 4\delta + 4\alpha\delta + 2\delta^2 - 4 + 8\alpha = 0.
\end{align*}
Next, we simplify this contraction factor by inserting $\delta = \frac{1}{1+\sigma}$ and
$\alpha = \frac{\beta}{2(1+\beta)}$. Since
$\delta(1-\alpha-\delta/2) + (1-\alpha-\delta/2)^2 - \alpha^2 + (\delta/2)^2 = 1 - 2\alpha$, we get
\begin{align*}
4(\delta - 1)\left(\tfrac{\delta}{2} + \tfrac{(1-\alpha-\delta/2)^2 - \alpha^2 + (\delta/2)^2}{2(1-\alpha-\delta/2)}\right) + 1
&= 4(\delta - 1)\frac{1 - 2\alpha}{2(1 - \alpha - \delta/2)} + 1 \\
&= 4\left(\tfrac{1}{1+\sigma} - 1\right)\frac{\frac{1}{1+\beta}}{2 - \frac{\beta}{1+\beta} - \frac{1}{1+\sigma}} + 1 \\
&= 4\left(\tfrac{1}{1+\sigma} - 1\right)\frac{2(1+\sigma)}{2(2+2\sigma+2\beta+2\beta\sigma - \beta - \beta\sigma - 1 - \beta)} + 1 \\
&= 4\left(\tfrac{1}{1+\sigma} - 1\right)\frac{2+2\sigma}{2(1+2\sigma+\beta\sigma)} + 1 \\
&= 4\left(\tfrac{-\sigma}{1+\sigma}\right)\frac{2+2\sigma}{2(1+2\sigma+\beta\sigma)} + 1 \\
&= 1 - \frac{4\sigma}{1 + 2\sigma + \beta\sigma}.
\end{align*}
Taking the square root concludes the proof.
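
The algebra in this proof is easy to check numerically. The following sketch (illustrative only; the function names and the sample $(\sigma, \beta)$ pairs in the usage note are assumptions) evaluates the bound obtained on $\mathcal{R}$, the bound obtained on $\mathcal{R}^c$, and the closed form $1 - \frac{4\sigma}{1+2\sigma+\beta\sigma}$.

```python
def moduli(sigma, beta):
    """alpha = beta / (2(1+beta)) and delta = 1 / (1+sigma) as in the proof."""
    return beta / (2.0 * (1.0 + beta)), 1.0 / (1.0 + sigma)

def threshold(sigma, beta):
    """Coefficient of |x - y|^2 that separates the sets R and R^c in (D.3)-(D.4)."""
    alpha, delta = moduli(sigma, beta)
    s = 1.0 - alpha - delta / 2.0
    return delta / 2.0 + (s * s - alpha * alpha + (delta / 2.0) ** 2) / (2.0 * s)

def bound_on_R(sigma, beta):
    """Squared contraction bound derived for pairs in R."""
    _, delta = moduli(sigma, beta)
    return 4.0 * (delta - 1.0) * threshold(sigma, beta) + 1.0

def bound_on_Rc(sigma, beta):
    """Squared contraction bound derived for pairs in R^c."""
    alpha, _ = moduli(sigma, beta)
    return 4.0 * (1.0 - 2.0 * alpha) * threshold(sigma, beta) + 1.0 - 4.0 + 8.0 * alpha

def closed_form(sigma, beta):
    """Simplified squared contraction factor from Theorem 7.2."""
    return 1.0 - 4.0 * sigma / (1.0 + 2.0 * sigma + beta * sigma)
```

For example, with $\sigma = \beta = 1$ all three functions return $0$, which matches the fact that the reflected resolvent of the identity operator is the zero map.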
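D.2. Proof to Proposition 7.6

We start by computing the resolvent and reflected resolvent of $\gamma A$. The
resolvent of $\gamma A$ is given by
\begin{align*}
J_{\gamma A} = (I + \gamma A)^{-1}
&= \begin{bmatrix}
\frac{\gamma d(1+c\cos\psi)}{(1+c\cos\psi)^2+c^2\sin^2\psi} + 1 & \frac{\gamma dc\sin\psi}{(1+c\cos\psi)^2+c^2\sin^2\psi} \\
-\frac{\gamma dc\sin\psi}{(1+c\cos\psi)^2+c^2\sin^2\psi} & \frac{\gamma d(1+c\cos\psi)}{(1+c\cos\psi)^2+c^2\sin^2\psi} + 1
\end{bmatrix}^{-1} \\
&= \begin{bmatrix}
\frac{\gamma d(1+c\cos\psi)}{1+2c\cos\psi+c^2} + 1 & \frac{\gamma dc\sin\psi}{1+2c\cos\psi+c^2} \\
-\frac{\gamma dc\sin\psi}{1+2c\cos\psi+c^2} & \frac{\gamma d(1+c\cos\psi)}{1+2c\cos\psi+c^2} + 1
\end{bmatrix}^{-1} \\
&= \begin{bmatrix}
\gamma\sigma + 1 & \frac{\gamma c\sin\psi\,\sigma}{1+c\cos\psi} \\
-\frac{\gamma c\sin\psi\,\sigma}{1+c\cos\psi} & \gamma\sigma + 1
\end{bmatrix}^{-1}
= \begin{bmatrix}
\gamma\sigma + 1 & \gamma c\sin\psi\,\sigma\beta/d \\
-\gamma c\sin\psi\,\sigma\beta/d & \gamma\sigma + 1
\end{bmatrix}^{-1} \\
&= \frac{1}{(\gamma\sigma+1)^2 + (\gamma c\sigma\beta\sin\psi/d)^2}
\begin{bmatrix}
\gamma\sigma + 1 & -\gamma c\sigma\beta\sin\psi/d \\
\gamma c\sigma\beta\sin\psi/d & \gamma\sigma + 1
\end{bmatrix}
\end{align*}
where $\sigma$ and $\beta$ are defined in (7.6). The reflected resolvent $R_{\gamma A}$ is given by
\[
R_{\gamma A} = 2J_{\gamma A} - I = \frac{2}{(\gamma\sigma+1)^2 + (\gamma c\sigma\beta\sin\psi/d)^2}
\begin{bmatrix}
\gamma\sigma + 1 - \frac{(\gamma\sigma+1)^2 + (\gamma c\sigma\beta\sin\psi/d)^2}{2} & -\gamma c\sigma\beta\sin\psi/d \\
\gamma c\sigma\beta\sin\psi/d & \gamma\sigma + 1 - \frac{(\gamma\sigma+1)^2 + (\gamma c\sigma\beta\sin\psi/d)^2}{2}
\end{bmatrix}.
\]
To simplify this expression, we note that
\begin{align*}
\frac{\beta}{\sigma} &= \frac{d(1+2c\cos\psi+c^2)}{d(1+c\cos\psi)^2}
= \frac{1+2c\cos\psi+c^2\cos^2\psi + c^2 - c^2\cos^2\psi}{(1+c\cos\psi)^2} \\
&= 1 + \frac{c^2(1-\cos^2\psi)}{(1+c\cos\psi)^2}
= 1 + \frac{c^2\sin^2\psi}{(1+c\cos\psi)^2}
= 1 + \frac{\beta^2c^2\sin^2\psi}{d^2}.
\end{align*}
This implies that
\begin{align*}
(\gamma\sigma + 1)^2 + (\gamma c\sigma\beta\sin\psi/d)^2 &= 1 + 2\gamma\sigma + (\gamma\sigma)^2(1 + c^2\beta^2\sin^2\psi/d^2) \\
&= 1 + 2\gamma\sigma + \sigma\beta\gamma^2. \tag{D.5}
\end{align*}
Using this equality, we can simplify the reflected resolvent expression:
\[
R_{\gamma A} = \frac{2}{1 + 2\gamma\sigma + \sigma\beta\gamma^2}
\begin{bmatrix}
\frac{1}{2}(1 - \sigma\beta\gamma^2) & -\gamma\sqrt{\sigma(\beta - \sigma)} \\
\gamma\sqrt{\sigma(\beta - \sigma)} & \frac{1}{2}(1 - \sigma\beta\gamma^2)
\end{bmatrix}
\]
since $\gamma c\sigma\beta\sin\psi/d \ge 0$. Now, let us write the matrix elements using polar
coordinates:
\[
\delta(\cos\xi, \sin\xi) = \left(\tfrac{1}{2}(1 - \sigma\beta\gamma^2),\ \gamma\sqrt{\sigma(\beta - \sigma)}\right).
\]
This gives the reflected resolvent:
\[
R_{\gamma A} = \frac{2\delta}{1 + 2\gamma\sigma + \sigma\beta\gamma^2}
\begin{bmatrix} \cos\xi & -\sin\xi \\ \sin\xi & \cos\xi \end{bmatrix}. \tag{D.6}
\]
The angle $\xi$ in the polar coordinates satisfies
\[
\tan\xi = \frac{2\gamma\sqrt{\sigma(\beta - \sigma)}}{1 - \sigma\beta\gamma^2}.
\]
The numerator is always nonnegative, so $\xi$ is given by
$\xi = \operatorname{arctan2}\left(\frac{2\gamma\sqrt{\sigma(\beta-\sigma)}}{1-\sigma\beta\gamma^2}\right)$
with arctan2 defined in (6.4). The radius $\delta$ in the polar coordinates satisfies
\begin{align*}
\delta^2 &= \delta^2(\cos^2\xi + \sin^2\xi) \\
&= \left(\tfrac{1}{2}(1 - \sigma\beta\gamma^2)\right)^2 + \gamma^2\sigma(\beta - \sigma) \\
&= \tfrac{1}{4}\left(1 - 2\sigma\beta\gamma^2 + (\sigma\beta\gamma^2)^2\right) + \sigma\beta\gamma^2 - (\gamma\sigma)^2 \\
&= \tfrac{1}{4}\left(1 + 2\sigma\beta\gamma^2 + (\sigma\beta\gamma^2)^2\right) - (\gamma\sigma)^2 \\
&= \tfrac{1}{4}(1 - 2\gamma\sigma + \gamma^2\sigma\beta)(1 + 2\gamma\sigma + \gamma^2\sigma\beta)
\end{align*}
and (since $\delta \ge 0$)
\[
2\delta = \sqrt{(1 - 2\gamma\sigma + \gamma^2\sigma\beta)(1 + 2\gamma\sigma + \gamma^2\sigma\beta)}. \tag{D.7}
\]
It remains to compute the factor in (D.6). Using (D.7), we conclude
\[
\frac{2\delta}{1 + 2\gamma\sigma + \sigma\beta\gamma^2}
= \frac{\sqrt{(1 - 2\gamma\sigma + \gamma^2\sigma\beta)(1 + 2\gamma\sigma + \gamma^2\sigma\beta)}}{1 + 2\gamma\sigma + \gamma^2\sigma\beta}
= \sqrt{\frac{1 - 2\gamma\sigma + \gamma^2\sigma\beta}{1 + 2\gamma\sigma + \gamma^2\sigma\beta}}
= \sqrt{1 - \frac{4\gamma\sigma}{1 + 2\gamma\sigma + \gamma^2\sigma\beta}}.
\]
This completes the proof.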
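
Two intermediate identities of this proof, (D.5) and the factor computed from (D.7), can be spot-checked numerically. The sketch below is illustrative; the function names and the parameter values used to exercise it are assumptions.

```python
import math

def sigma_beta(c, psi, d):
    """sigma and beta from (7.6)."""
    a = 1.0 + c * math.cos(psi)
    return d * a / (1.0 + 2.0 * c * math.cos(psi) + c * c), d / a

def d5_sides(c, psi, d, gamma):
    """Left- and right-hand sides of identity (D.5)."""
    sigma, beta = sigma_beta(c, psi, d)
    lhs = (gamma * sigma + 1.0) ** 2 + (gamma * c * sigma * beta * math.sin(psi) / d) ** 2
    rhs = 1.0 + 2.0 * gamma * sigma + sigma * beta * gamma**2
    return lhs, rhs

def factor_two_ways(c, psi, d, gamma):
    """2*delta/(1 + 2 gamma sigma + sigma beta gamma^2) from the polar radius, and its closed form."""
    sigma, beta = sigma_beta(c, psi, d)
    big = 1.0 + 2.0 * gamma * sigma + sigma * beta * gamma**2
    radius = math.hypot(0.5 * (1.0 - sigma * beta * gamma**2),
                        gamma * math.sqrt(max(0.0, sigma * (beta - sigma))))
    return 2.0 * radius / big, math.sqrt(1.0 - 4.0 * gamma * sigma / big)
```

Both pairs of values should agree to rounding error for any $c \in (1,\infty)$, $\psi \in [0, \frac{\pi}{2})$, $d > 0$, and $\gamma > 0$.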